
Conversation

@sujay-d07
Collaborator

@sujay-d07 sujay-d07 commented Dec 24, 2025

Related Issue

Closes #357

Summary

This pull request introduces major improvements to Cortex's LLM integration, making local, privacy-first inference via Ollama the new default. It adds a robust Ollama setup flow, updates documentation and installation instructions, and refines model selection and LLM routing logic. The changes ensure users can use Cortex completely offline, with cloud fallback as an option, and no API keys required for basic usage. The most important changes are grouped below:

Ollama Integration and Local LLM Support:

  • Added a new post-install hook (cortex/_post_install.py) that automatically runs Ollama setup after installation, including prompts for model selection and configuration.
  • Introduced the OllamaProvider class and integrated it throughout the codebase, making local LLM inference via Ollama the default for all tasks. [1] [2]
  • Updated the first-run wizard and model recommendation logic to default to codellama:7b and use new model naming conventions (e.g., phi3:mini, llama3:8b). [1] [2] [3]
  • Enhanced the LLM router to prioritize Ollama (local) first, followed by cloud providers (Claude, Kimi K2, OpenAI), and updated provider enums and documentation. [1] [2] [3] [4]
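
To make the local-first routing order concrete, here is a minimal, hedged sketch. Only OllamaProvider and its is_installed()/is_running() checks come from this PR; the function name and the environment-variable names are illustrative assumptions, not the actual LLMRouter code.

# Illustrative sketch of local-first provider selection with cloud fallback.
# Only OllamaProvider.is_installed()/is_running() are taken from this PR;
# the function and env var names are assumptions.
import os

from cortex.providers import OllamaProvider

def pick_provider(prefer_local: bool = True) -> str:
    """Return the provider to try first, preferring local Ollama when usable."""
    if prefer_local:
        ollama = OllamaProvider()
        if ollama.is_installed() and ollama.is_running():
            return "ollama"
    # Cloud fallback order described in the PR: Claude, then Kimi K2, then OpenAI
    for name, env_var in [
        ("claude", "ANTHROPIC_API_KEY"),   # assumed env var name
        ("kimi_k2", "MOONSHOT_API_KEY"),   # assumed env var name
        ("openai", "OPENAI_API_KEY"),      # assumed env var name
    ]:
        if os.getenv(env_var):
            return name
    raise RuntimeError("No LLM provider available")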

Documentation and User Experience:

  • Overhauled installation and usage instructions in README.md and AGENTS.md to emphasize privacy-first local LLMs, offline capability, and optional API keys; added detailed instructions for managing Ollama models and setup. [1] [2] [3] [4] [5] [6] [7] [8]
  • Added comprehensive changelog entries and updated manifest to include new scripts and documentation. [1] [2]

Robustness and Error Handling:

  • Improved parsing of LLM responses to handle markdown code blocks and partial JSON, with better error logging for debugging. [1] [2]
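
To illustrate the parsing idea (this is not the PR's actual implementation), a small hedged sketch that strips a markdown code fence and trims to the outermost JSON brackets before decoding:

# Illustrative sketch only - not the PR's parsing code.
import json
import re

def extract_json(raw: str):
    """Pull a JSON object or array out of a reply that may wrap it in ```json fences."""
    # Prefer the contents of a fenced code block when one is present.
    fence = re.search(r"```(?:json)?\s*(.*?)```", raw, re.DOTALL)
    candidate = fence.group(1) if fence else raw
    # Trim to the outermost JSON bracket so surrounding prose is ignored.
    starts = [i for i in (candidate.find("{"), candidate.find("[")) if i != -1]
    if not starts:
        raise ValueError("No JSON found in LLM response")
    start = min(starts)
    end = max(candidate.rfind("}"), candidate.rfind("]"))
    return json.loads(candidate[start:end + 1])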

Important Commands:

Install (Interactive Ollama Setup)
python setup.py develop

Full Setup (From Scratch)

git clone https://github.com/cortexlinux/cortex.git
cd cortex
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip setuptools wheel
python setup.py develop

Verify Installation

cortex --help
cortex install nginx --dry-run

Install Ollama Later (If Skipped)

python setup.py develop

Testing

# All tests
pytest tests/ -v

# Ollama-specific tests
pytest tests/test_ollama_integration.py tests/test_ollama_setup_integration.py -v

Migration (Old Install → New)

pip uninstall cortex-linux
python setup.py develop

Summary by CodeRabbit

  • New Features

    • Local LLM support via Ollama (privacy-first, offline-capable, zero-cost) with interactive setup, a new setup command, and a demo.
  • Changed

    • Routing now prefers local Ollama with automatic cloud fallback; API keys optional for cloud usage.
    • Default model recommendations and first-run flow updated for local-first operation; Python 3.10+ required.
  • Documentation

    • New Ollama quickstart, integration, auto-setup, and usage guides; README updated with privacy/offline messaging.
  • Tests

    • New integration tests and examples covering Ollama workflows and routing behavior.


Copilot AI review requested due to automatic review settings December 24, 2025 18:44
@coderabbitai
Contributor

coderabbitai bot commented Dec 24, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Adds a local-first Ollama provider and integration: new OllamaProvider, LLMRouter changes to prefer Ollama with cloud fallbacks, provider/package wiring and post-install auto-setup, interpreter/default model updates, docs/examples/config, tests, and test output showing several Ollama connectivity/routing failures.

Changes

Cohort / File(s) Summary
Core Ollama Provider Implementation
cortex/providers/ollama_provider.py, cortex/providers/__init__.py
New OllamaProvider with install/start checks, model discovery/selection/pull, ensure_model_available, sync & streaming completion API, model introspection, and package re-export.
LLM Routing & Provider Integration
cortex/llm_router.py
Adds LLMProvider.OLLAMA, defaults many routes to Ollama, prefer_local ctor param, availability/model checks, _complete_ollama() path, fallback to cloud providers, cost/stats updates, and attempted-provider tracking.
Interpreter & First-run Defaults
cortex/llm/interpreter.py, cortex/first_run_wizard.py
Default Ollama model set to codellama:7b; _call_ollama() refactored to use OllamaProvider; improved JSON/command extraction; first-run messaging and model-pull command updated.
Hardware & Model Recommendations
cortex/kernel_features/hardware_detect.py
Updated recommended models for >=8GB to Ollama-style identifiers (phi3:mini, llama3:8b, gemma:2b, qwen2.5:7b).
Post-install Setup Automation & CLI
setup.py, cortex/_post_install.py, scripts/setup_ollama.py, scripts/__init__.py
Post-install/develop/egg_info hooks invoke setup_ollama(), new console entry cortex-setup-ollama, auto-install/start/pull flow with CI/non-interactive guards and marker file; setup wiring and packaging updates.
Examples, Config & Scripts
examples/ollama_demo.py, examples/sample-config.yaml, scripts/verify_ollama_setup.sh
Demo showing Ollama status, local completion, router fallback; sample config adds llm block for Ollama and cloud fallbacks; verification script for auto-setup.
Tests & Test Output
tests/test_ollama_integration.py, tests/test_llm_router.py, tests/test_cli.py, tests/test_ollama_setup_integration.py, tests/integration/*, test_output.txt
New Ollama integration tests and updated router/CLI tests reflecting Ollama-first routing and fallbacks; test run log shows multiple failures related to Ollama connectivity and routing expectations.
Docs & Packaging
CHANGELOG.md, README.md, docs/*, AGENTS.md, MANIFEST.in, pyproject.toml, setup.py
Extensive docs added/updated for Ollama integration and auto-setup; manifest includes scripts/; setup adds post-install hooks and cortex-setup-ollama entry point; license metadata formatting adjusted.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant LLMRouter
    participant OllamaProvider
    participant Local_Ollama
    participant CloudProvider

    User->>LLMRouter: complete(task, messages)
    activate LLMRouter

    rect rgb(232,245,233)
    Note over LLMRouter,OllamaProvider: Local-first path — ensure service & model
    LLMRouter->>OllamaProvider: ensure_model_available()
    activate OllamaProvider
    OllamaProvider->>Local_Ollama: is_running() / start_service()
    Local_Ollama-->>OllamaProvider: ready
    OllamaProvider->>Local_Ollama: /api/generate (prompt)
    Local_Ollama-->>OllamaProvider: streamed response
    OllamaProvider-->>LLMRouter: LLMResponse (Ollama)
    deactivate OllamaProvider
    end

    rect rgb(255,248,225)
    Note over LLMRouter,CloudProvider: Fallback if Ollama unavailable or errors
    alt Ollama unavailable or error
        LLMRouter->>CloudProvider: complete() via Claude/Kimi/OpenAI
        CloudProvider-->>LLMRouter: completion
    end
    end

    LLMRouter->>User: return LLMResponse
    deactivate LLMRouter

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Suggested labels

enhancement

Suggested reviewers

  • mikejmorgan-ai
  • Suyashd999

Poem

🐰 I hopped through code and pulled a friend,
Ollama local, no API to lend.
Models snug in my burrowed den,
Streaming answers, safe for when.
Carrots, code, and quiet zen.

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The PR title clearly and concisely summarizes the main change: adding Ollama integration for local LLM support. It directly reflects the primary objective of the changeset.
Linked Issues check ✅ Passed The PR implementation successfully addresses all primary objectives from issue #357: Ollama auto-detection, smart model selection, streaming support, offline operation, cloud fallback routing, multiple model support, and comprehensive documentation/tests.
Out of Scope Changes check ✅ Passed All changes are directly related to Ollama integration objectives. Minor updates to pyproject.toml (license format) and AGENTS.md installation instructions are necessary ancillary changes supporting the core feature.
Docstring Coverage ✅ Passed Docstring coverage is 89.87% which is sufficient. The required threshold is 80.00%.
Description check ✅ Passed The pull request description is comprehensive and well-structured, covering all major changes including Ollama integration, documentation updates, robustness improvements, and testing instructions.

Contributor

Copilot AI left a comment


Pull request overview

This PR adds comprehensive local LLM support to Cortex Linux via Ollama integration, enabling privacy-first, offline-capable package management without cloud API dependencies. The implementation includes automated setup, intelligent provider routing with fallback, and extensive documentation.

Key Changes

  • New OllamaProvider with full model management and inference capabilities
  • Updated LLM router to prioritize local Ollama over cloud providers (Claude/OpenAI)
  • Automated post-install setup script for seamless user experience
  • Comprehensive test suite covering provider functionality, routing, and fallback logic

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
tests/test_ollama_integration.py New test suite with 221 lines covering Ollama integration
tests/test_llm_router.py Updated routing tests to accommodate Ollama-first priority
tests/test_cli.py Modified CLI test to reflect API-key-optional behavior
cortex/providers/ollama_provider.py New 439-line provider implementation for local LLM
cortex/llm_router.py Enhanced routing with Ollama priority and smart fallback
setup.py Added post-install hooks for automated Ollama setup
scripts/setup_ollama.py 259-line setup script for installation and configuration
examples/ollama_demo.py Interactive demo showcasing Ollama capabilities
Documentation files Comprehensive guides including quickstart and integration docs


Comment on lines 54 to 58
from requests.exceptions import ConnectionError
mock_get.side_effect = ConnectionError("Connection refused")

provider2 = OllamaProvider()
self.assertFalse(provider2.is_running())

Copilot AI Dec 24, 2025


The test expects ConnectionError to be raised when mocking a failed connection, but the exception type is incorrect. The code should raise requests.exceptions.ConnectionError, not a generic Exception("Connection refused"). The mock is setting side_effect = ConnectionError("Connection refused"), but this should be requests.exceptions.ConnectionError("Connection refused") for proper exception handling.
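
For illustration, a hedged sketch of the corrected mock; the patch target path is an assumption, not the PR's exact test code:

# Sketch only: the patch target path below is assumed, not copied from the PR.
from unittest.mock import patch

import requests

from cortex.providers import OllamaProvider

with patch("cortex.providers.ollama_provider.requests.get") as mock_get:
    mock_get.side_effect = requests.exceptions.ConnectionError("Connection refused")
    provider = OllamaProvider()
    assert provider.is_running() is False  # RequestException subclass is caught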

Comment on lines 466 to 475
# System operation should try Ollama first, then fallback chain
response = router.complete(
    messages=[{"role": "user", "content": "Install CUDA"}],
    task_type=TaskType.SYSTEM_OPERATION,
)

self.assertEqual(response.provider, LLMProvider.CLAUDE)
self.assertEqual(response.content, "Fallback response")
# With Ollama integration, could be Ollama or any fallback provider
self.assertIn(response.provider, [LLMProvider.OLLAMA, LLMProvider.CLAUDE, LLMProvider.KIMI_K2])
# Check response content exists
self.assertIsNotNone(response.content)

Copilot AI Dec 24, 2025


The test expects Claude as the fallback provider but with Ollama integration, the actual provider returned is OLLAMA. The test needs to be updated to reflect the new fallback priority: Ollama → Claude → Kimi K2.

Comment on lines 63 to 66
def test_install_no_api_key(self):
    # With Ollama integration, no API key is needed - should succeed
    result = self.cli.install("docker")
    self.assertEqual(result, 1)
    self.assertEqual(result, 0)

Copilot AI Dec 24, 2025


The test expects a return code of 1 (failure) when no API key is provided, but with Ollama integration, the system no longer requires API keys and returns 0 (success). The test expectation should be updated to reflect that Ollama enables usage without API keys.

Comment on lines 145 to 152
try:
    response = requests.get(
        f"{self.base_url}/api/tags",
        timeout=5
    )
    return response.status_code == 200
except requests.RequestException:
    return False

Copilot AI Dec 24, 2025


The exception handling catches a broad Exception but the actual exception type raised by requests is ConnectionError. The code should specifically handle requests.exceptions.RequestException or requests.exceptions.ConnectionError for better error handling clarity.

Comment on lines 171 to 190
mock_ollama.complete.return_value = {
    "response": "Install nginx using apt-get",
    "model": "llama3:8b"
}
mock_ollama_class.return_value = mock_ollama

router = LLMRouter()
router.ollama_client = mock_ollama  # Ensure router uses our mock

messages = [{"role": "user", "content": "How to install nginx?"}]
response = router.complete(
    messages=messages,
    task_type=TaskType.SYSTEM_OPERATION,
    force_provider=LLMProvider.OLLAMA
)

self.assertEqual(response.provider, LLMProvider.OLLAMA)
# Check that complete was called on the mock
mock_ollama.complete.assert_called_once()
self.assertIn("nginx", response.content.lower())

Copilot AI Dec 24, 2025


The test is not properly mocking the Ollama provider's complete method return value. The mock returns a MagicMock object, but the test expects the response content to be a string containing "nginx". The mock should return a dict with a "response" key containing the actual content string, not a MagicMock object.

return False

except subprocess.TimeoutExpired:
logger.warning("⚠️ Model download timed out - try again with: ollama pull {model_name}")

Copilot AI Dec 24, 2025


The format string contains a non-existent variable reference. The warning message uses {model_name} but this variable is not in scope at line 204. This will cause an error if the timeout branch is executed. The variable should be referenced from the function context or removed from the message.

Suggested change
logger.warning("⚠️ Model download timed out - try again with: ollama pull {model_name}")
logger.warning(f"⚠️ Model download timed out - try again with: ollama pull {model_name}")

Comment on lines +57 to +62
def __init__(
    self,
    base_url: str | None = None,
    timeout: int = 300,
    auto_pull: bool = True,
):

Copilot AI Dec 24, 2025


The function definition uses Python 3.10+ union type syntax str | None but also includes type hints that may not be compatible with all Python versions. Consider using Optional[str] from typing for better backwards compatibility, or ensure the codebase consistently requires Python 3.10+.

Comment on lines 124 to 129
subprocess.run(["ollama", "serve"],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
start_new_session=True)
time.sleep(2) # Give service time to start
return True

Copilot AI Dec 24, 2025


Using hardcoded sleep times for service startup is unreliable and can cause race conditions. The service might not be ready after 2 seconds on slower systems, or this could waste time on faster systems. Consider implementing a proper retry loop with health checks instead of fixed sleep durations.

Suggested change
subprocess.run(["ollama", "serve"],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
start_new_session=True)
time.sleep(2) # Give service time to start
return True
subprocess.run(
["ollama", "serve"],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
start_new_session=True,
)
# Wait for Ollama service to become ready instead of using a fixed sleep
start_time = time.time()
timeout_seconds = 30
health_url = "http://localhost:11434/api/tags"
while time.time() - start_time < timeout_seconds:
try:
response = requests.get(health_url, timeout=2)
if response.status_code == 200:
logger.info("✅ Ollama service is running")
return True
except requests.RequestException:
# Service not ready yet; keep retrying until timeout
pass
time.sleep(0.5)
logger.error("Ollama service did not become ready within the expected time")
return False

"""

import unittest
from unittest.mock import Mock, patch, MagicMock

Copilot AI Dec 24, 2025


Import of 'MagicMock' is not used.

Suggested change
from unittest.mock import Mock, patch, MagicMock
from unittest.mock import Mock, patch

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
cortex/kernel_features/hardware_detect.py (1)

346-377: Fix inconsistent model naming: all models must use Ollama's colon format (model:tag).

Lines 355, 360, 364, 370, and 374 use dash notation (e.g., llama3.1-70b-q4, mistral-7b), which is incompatible with Ollama's /api/pull API. Line 367 correctly uses colon notation (phi3:mini, llama3:8b). The OllamaProvider will fail to pull models with incorrect names.

Convert all model names to colon format:

  • llama3.1-70b-q4llama3.1:70b-q4
  • qwen2.5-72b-q4qwen2.5:72b-q4
  • codellama-34b-q4codellama:34b-q4
  • mistral-7bmistral:7b
  • phi-2phi3:2.5b (or verify correct Ollama model name)
  • tinyllama-1.1btinyllama:1.1b

For NPU variants (line 374: phi-3-mini-npu, qwen2.5-1.5b-npu), verify if these exist in Ollama registry before including them.

README.md (1)

370-375: Project status section is inconsistent with this PR.

Line 373 lists "Ollama local model support" under "In Progress", but this PR implements full Ollama integration. Update the status to reflect the completed feature.

🔎 Proposed fix
 ### In Progress
 - [ ] Conflict resolution UI
 - [ ] Multi-step orchestration
-- [ ] Ollama local model support
 - [ ] MCP server integration
 - [ ] Snap/Flatpak support
+
+### Completed
+- [x] Natural language to package resolution
+- [x] Claude and OpenAI integration
+- [x] Installation history and rollback
+- [x] User preferences (YAML-backed)
+- [x] Hardware detection (GPU/CPU/Memory)
+- [x] Firejail sandboxing
+- [x] Dry-run preview mode
+- [x] Ollama local model support
cortex/llm_router.py (2)

521-544: Ollama stats missing from get_stats() return value.

The provider_stats dictionary tracks Ollama usage (line 203), but get_stats() only returns stats for Claude and Kimi K2, omitting Ollama entirely.

🔎 Proposed fix
         with self._stats_lock:
             return {
                 "total_requests": self.request_count,
                 "total_cost_usd": round(self.total_cost_usd, 4),
                 "providers": {
+                    "ollama": {
+                        "requests": self.provider_stats[LLMProvider.OLLAMA]["requests"],
+                        "tokens": self.provider_stats[LLMProvider.OLLAMA]["tokens"],
+                        "cost_usd": round(self.provider_stats[LLMProvider.OLLAMA]["cost"], 4),
+                    },
                     "claude": {
                         "requests": self.provider_stats[LLMProvider.CLAUDE]["requests"],
                         "tokens": self.provider_stats[LLMProvider.CLAUDE]["tokens"],
                         "cost_usd": round(self.provider_stats[LLMProvider.CLAUDE]["cost"], 4),
                     },
                     "kimi_k2": {
                         "requests": self.provider_stats[LLMProvider.KIMI_K2]["requests"],
                         "tokens": self.provider_stats[LLMProvider.KIMI_K2]["tokens"],
                         "cost_usd": round(self.provider_stats[LLMProvider.KIMI_K2]["cost"], 4),
                     },
                 },
             }

591-626: acomplete() doesn't support Ollama routing.

The async completion method only handles Claude and Kimi K2 providers (lines 592-595), but route_task() can return LLMProvider.OLLAMA. This will cause Ollama requests to incorrectly fall through to the Kimi K2 path.

🔎 Proposed fix
         try:
-            if routing.provider == LLMProvider.CLAUDE:
+            if routing.provider == LLMProvider.OLLAMA:
+                # Ollama doesn't have native async - run sync in executor
+                loop = asyncio.get_event_loop()
+                response = await loop.run_in_executor(
+                    None,
+                    lambda: self._complete_ollama(messages, temperature, max_tokens)
+                )
+            elif routing.provider == LLMProvider.CLAUDE:
                 response = await self._acomplete_claude(messages, temperature, max_tokens, tools)
             else:  # KIMI_K2
                 response = await self._acomplete_kimi(messages, temperature, max_tokens, tools)

Also update the fallback logic (lines 609-614) to include Ollama:

             if self.enable_fallback:
-                fallback_provider = (
-                    LLMProvider.KIMI_K2
-                    if routing.provider == LLMProvider.CLAUDE
-                    else LLMProvider.CLAUDE
-                )
+                if routing.provider == LLMProvider.OLLAMA:
+                    fallback_provider = (
+                        LLMProvider.CLAUDE if self.claude_client_async
+                        else LLMProvider.KIMI_K2 if self.kimi_client_async
+                        else None
+                    )
+                elif routing.provider == LLMProvider.CLAUDE:
+                    fallback_provider = (
+                        LLMProvider.KIMI_K2 if self.kimi_client_async
+                        else None
+                    )
+                else:  # KIMI_K2
+                    fallback_provider = (
+                        LLMProvider.CLAUDE if self.claude_client_async
+                        else None
+                    )
+                
+                if not fallback_provider:
+                    raise RuntimeError("No fallback provider available")
🧹 Nitpick comments (11)
docs/PR_OLLAMA_INTEGRATION.md (2)

60-62: Add language identifier to fenced code block.

The fallback logic code block is missing a language identifier. Consider adding text or mermaid for proper rendering.

🔎 Proposed fix
-```
+```text
 Ollama (local) → Claude → Kimi K2 → Error

306-307: Format URLs as markdown links.

Bare URLs should be properly formatted as markdown links for better readability and accessibility.

🔎 Proposed fix
-- **Discord:** https://discord.gg/uCqHvxjU83
-- **Email:** mike@cortexlinux.com
+- **Discord:** [https://discord.gg/uCqHvxjU83](https://discord.gg/uCqHvxjU83)
+- **Email:** [mike@cortexlinux.com](mailto:mike@cortexlinux.com)
docs/OLLAMA_INTEGRATION.md (1)

42-42: Specify language for code block.

The bash command should be wrapped in a code block with language specification for proper syntax highlighting.

🔎 Proposed fix
+```bash
 cortex install nginx
+```
cortex/llm/interpreter.py (1)

144-180: Verify error message and exception type consistency.

The refactored _call_ollama method has good integration with OllamaProvider, but there's a minor inconsistency:

Line 159 error message says "No Ollama models available. Run: ollama pull llama3:8b", but the code now uses codellama:7b as the default model (line 65). The error message should suggest pulling the default model for consistency.

🔎 Proposed fix
             model = ollama.ensure_model_available()
             if not model:
-                raise RuntimeError("No Ollama models available. Run: ollama pull llama3:8b")
+                raise RuntimeError("No Ollama models available. Run: ollama pull codellama:7b")
examples/ollama_demo.py (1)

21-22: Relative path assumption may fail in different execution contexts.

Line 22 uses sys.path.insert(0, '..') to add the parent directory to the Python path. This assumes the script is always run from the examples/ directory, which may not hold true if:

  • The script is run from the project root
  • The script is executed via absolute path from another directory
  • The script is installed as a package
🔎 Proposed fix

Use __file__ to determine the script's location reliably:

+import os
 import sys
 import time
 from rich.console import Console
 from rich.table import Table
 from rich.panel import Panel
 
 # Add parent directory to path
-sys.path.insert(0, '..')
+script_dir = os.path.dirname(os.path.abspath(__file__))
+parent_dir = os.path.dirname(script_dir)
+sys.path.insert(0, parent_dir)
scripts/setup_ollama.py (2)

40-76: Security consideration: Executing downloaded script from external source.

Lines 43-61 download and execute a shell script from https://ollama.com/install.sh. While this is the official Ollama installation method, it presents security risks:

  1. No integrity verification: The script is executed without checksum validation
  2. HTTPS trust: Relies entirely on HTTPS/TLS for authenticity
  3. Arbitrary code execution: Downloaded content is piped to sh -c

This is acceptable since:

  • It's the official Ollama installation method documented by Ollama
  • Users can skip with CORTEX_SKIP_OLLAMA_SETUP=1
  • Error handling prevents installation failures from blocking Cortex setup

However, consider documenting this security trade-off in the code comments.

Suggested documentation addition
     logger.info("📦 Installing Ollama for local LLM support...")
     logger.info("   This enables privacy-first, offline package management")
+    
+    # Security note: This downloads and executes the official Ollama install script.
+    # Users can skip this by setting CORTEX_SKIP_OLLAMA_SETUP=1 environment variable.
+    # See: https://github.com/ollama/ollama for installation verification.
     
     try:

100-102: Fixed sleep duration may be insufficient or excessive.

Line 101 uses time.sleep(2) after starting the Ollama service. This arbitrary 2-second wait may:

  • Be insufficient on slower systems (service not ready yet)
  • Be excessive on fast systems (unnecessary delay)
🔎 Proposed fix

Implement a retry loop with timeout instead of fixed sleep:

         # Give it a moment to start
-        time.sleep(2)
-        logger.info("✅ Ollama service started")
-        return True
+        # Poll for service readiness
+        max_attempts = 10
+        for attempt in range(max_attempts):
+            time.sleep(1)
+            try:
+                import requests
+                response = requests.get("http://localhost:11434/api/tags", timeout=2)
+                if response.status_code == 200:
+                    logger.info("✅ Ollama service started")
+                    return True
+            except Exception:
+                pass
+        
+        logger.warning("⚠️  Service started but not responding yet")
+        return True  # Don't fail - service may still be starting
cortex/llm_router.py (2)

388-390: Token estimation is imprecise.

The word-count-based estimation (len(content.split()) * 1.3) is a rough approximation. Consider using Ollama's actual token count from the response if available, or document this limitation.

🔎 Proposed improvement
-        # Ollama doesn't provide token counts in the same way
-        # Estimate based on response length
-        tokens_used = len(content.split()) * 1.3  # Rough estimate
+        # Use Ollama's token counts if available, otherwise estimate
+        eval_count = response_data.get("eval_count", 0)
+        prompt_eval_count = response_data.get("prompt_eval_count", 0)
+        if eval_count or prompt_eval_count:
+            tokens_used = eval_count + prompt_eval_count
+        else:
+            # Fallback: estimate based on response length
+            tokens_used = int(len(content.split()) * 1.3)

319-354: Fallback logic could recurse indefinitely in edge cases.

If a provider fails repeatedly and fallback keeps selecting the same failing provider chain, this could cause deep recursion. The current logic mitigates this by using force_provider, but consider adding a recursion guard.

🔎 Proposed improvement

Add a _fallback_depth parameter to prevent unbounded recursion:

     def complete(
         self,
         messages: list[dict[str, str]],
         task_type: TaskType = TaskType.USER_CHAT,
         force_provider: LLMProvider | None = None,
         temperature: float = 0.7,
         max_tokens: int = 4096,
         tools: list[dict] | None = None,
+        _fallback_depth: int = 0,
     ) -> LLMResponse:

Then guard before recursing:

                 if fallback_provider:
+                    if _fallback_depth >= 2:
+                        raise RuntimeError("Max fallback depth exceeded")
                     logger.info(f"🔄 Attempting fallback to {fallback_provider.value}")

                     return self.complete(
                         messages=messages,
                         task_type=task_type,
                         force_provider=fallback_provider,
                         temperature=temperature,
                         max_tokens=max_tokens,
                         tools=tools,
+                        _fallback_depth=_fallback_depth + 1,
                     )
cortex/providers/ollama_provider.py (2)

263-285: Model pull timeout may be insufficient for large models.

The pull_model() method uses self.timeout (default 300 seconds) for downloading models. Models like deepseek-coder-v2:16b (10GB+) may require significantly longer on slower connections.

🔎 Proposed fix
     def pull_model(self, model_name: str) -> bool:
         """
         Pull a model from Ollama registry.
         
         Args:
             model_name: Name of model to pull
             
         Returns:
             True if successful, False otherwise
         """
         logger.info(f"📥 Pulling model: {model_name}")
         
         try:
             response = requests.post(
                 f"{self.base_url}/api/pull",
                 json={"name": model_name},
                 stream=True,
-                timeout=self.timeout
+                timeout=None  # No timeout for model downloads
             )

Alternatively, add a separate pull_timeout parameter to __init__.


374-398: Prompt format may not be optimal for all models.

The simple System:/User:/Assistant: format works for basic models but may not be optimal for instruction-tuned models that expect specific prompt templates (e.g., Llama 3's <|begin_of_text|> format, Mistral's [INST] format). Consider using Ollama's native chat API (/api/chat) instead.

🔎 Proposed improvement
     def complete(
         self,
         messages: list[dict[str, str]],
         model: str | None = None,
         temperature: float = 0.7,
         max_tokens: int = 4096,
         stream: bool = False,
     ) -> dict[str, Any] | Generator[dict[str, Any], None, None]:
         # ... existing setup code ...
         
-        # Convert messages to Ollama format
-        prompt = self._messages_to_prompt(messages)
-
         try:
             response = requests.post(
-                f"{self.base_url}/api/generate",
+                f"{self.base_url}/api/chat",
                 json={
                     "model": model,
-                    "prompt": prompt,
-                    "temperature": temperature,
+                    "messages": messages,  # Native OpenAI-compatible format
                     "options": {
+                        "temperature": temperature,
                         "num_predict": max_tokens,
                     },
                     "stream": stream,
                 },

This uses Ollama's /api/chat endpoint which natively understands OpenAI-style message formats.
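
For reference, a standalone hedged sketch of a non-streaming /api/chat call (separate from the diff above); the base URL, model name, and timeout are placeholder assumptions:

# Minimal sketch of calling Ollama's /api/chat endpoint with OpenAI-style messages.
# Base URL, model name, and timeout are placeholder assumptions.
import requests

def chat_once(messages, model="llama3:8b", base_url="http://localhost:11434"):
    resp = requests.post(
        f"{base_url}/api/chat",
        json={
            "model": model,
            "messages": messages,            # [{"role": "user", "content": "..."}]
            "options": {"temperature": 0.7},
            "stream": False,                 # one JSON object instead of a stream
        },
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["message"]["content"]

print(chat_once([{"role": "user", "content": "How do I install nginx?"}]))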

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c6dd6e0 and 1bed948.

📒 Files selected for processing (20)
  • CHANGELOG.md
  • README.md
  • cortex/first_run_wizard.py
  • cortex/kernel_features/hardware_detect.py
  • cortex/llm/interpreter.py
  • cortex/llm_router.py
  • cortex/providers/__init__.py
  • cortex/providers/ollama_provider.py
  • docs/OLLAMA_IMPLEMENTATION_SUMMARY.md
  • docs/OLLAMA_INTEGRATION.md
  • docs/OLLAMA_QUICKSTART.md
  • docs/PR_OLLAMA_INTEGRATION.md
  • examples/ollama_demo.py
  • examples/sample-config.yaml
  • scripts/setup_ollama.py
  • setup.py
  • test_output.txt
  • tests/test_cli.py
  • tests/test_llm_router.py
  • tests/test_ollama_integration.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • cortex/providers/__init__.py
  • tests/test_cli.py
  • cortex/kernel_features/hardware_detect.py
  • cortex/llm/interpreter.py
  • cortex/first_run_wizard.py
  • tests/test_llm_router.py
  • scripts/setup_ollama.py
  • examples/ollama_demo.py
  • cortex/llm_router.py
  • cortex/providers/ollama_provider.py
  • tests/test_ollama_integration.py
  • setup.py
{setup.py,setup.cfg,pyproject.toml,**/__init__.py}

📄 CodeRabbit inference engine (AGENTS.md)

Use Python 3.10 or higher as the minimum supported version

Files:

  • cortex/providers/__init__.py
  • setup.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Maintain >80% test coverage for pull requests

Files:

  • tests/test_cli.py
  • tests/test_llm_router.py
  • tests/test_ollama_integration.py
🧬 Code graph analysis (9)
cortex/providers/__init__.py (1)
cortex/providers/ollama_provider.py (1)
  • OllamaProvider (30-439)
tests/test_cli.py (1)
cortex/cli.py (1)
  • install (297-525)
cortex/llm/interpreter.py (1)
cortex/providers/ollama_provider.py (4)
  • OllamaProvider (30-439)
  • is_running (138-152)
  • start_service (154-191)
  • ensure_model_available (287-311)
cortex/first_run_wizard.py (2)
setup.py (2)
  • run (10-18)
  • run (23-31)
mcp/cortex_mcp_server.py (1)
  • run (240-244)
tests/test_llm_router.py (1)
cortex/llm_router.py (4)
  • route_task (208-268)
  • TaskType (35-45)
  • LLMProvider (48-53)
  • complete (270-354)
scripts/setup_ollama.py (3)
cortex/providers/ollama_provider.py (1)
  • install_ollama (88-136)
cortex/logging_system.py (2)
  • info (198-200)
  • warning (202-204)
setup.py (2)
  • run (10-18)
  • run (23-31)
examples/ollama_demo.py (2)
cortex/providers/ollama_provider.py (6)
  • is_installed (78-85)
  • is_running (138-152)
  • start_service (154-191)
  • get_available_models (193-216)
  • ensure_model_available (287-311)
  • complete (313-372)
cortex/llm_router.py (3)
  • TaskType (35-45)
  • complete (270-354)
  • route_task (208-268)
cortex/providers/ollama_provider.py (3)
scripts/setup_ollama.py (1)
  • install_ollama (26-76)
cortex/logging_system.py (2)
  • info (198-200)
  • warning (202-204)
cortex/llm_router.py (1)
  • complete (270-354)
tests/test_ollama_integration.py (2)
cortex/providers/ollama_provider.py (2)
  • is_installed (78-85)
  • complete (313-372)
cortex/llm_router.py (4)
  • LLMProvider (48-53)
  • TaskType (35-45)
  • route_task (208-268)
  • complete (270-354)
🪛 GitHub Actions: CI
cortex/llm/interpreter.py

[error] 148-148: ruff: W293 Blank line contains whitespace.

🪛 GitHub Check: lint
cortex/llm/interpreter.py

[failure] 175-175: Ruff (W293)
cortex/llm/interpreter.py:175:1: W293 Blank line contains whitespace


[failure] 166-166: Ruff (W293)
cortex/llm/interpreter.py:166:1: W293 Blank line contains whitespace


[failure] 160-160: Ruff (W293)
cortex/llm/interpreter.py:160:1: W293 Blank line contains whitespace


[failure] 156-156: Ruff (W293)
cortex/llm/interpreter.py:156:1: W293 Blank line contains whitespace


[failure] 151-151: Ruff (W293)
cortex/llm/interpreter.py:151:1: W293 Blank line contains whitespace


[failure] 148-148: Ruff (W293)
cortex/llm/interpreter.py:148:1: W293 Blank line contains whitespace


[failure] 221-221: Ruff (W293)
cortex/llm/interpreter.py:221:1: W293 Blank line contains whitespace


[failure] 218-218: Ruff (W293)
cortex/llm/interpreter.py:218:1: W293 Blank line contains whitespace


[failure] 210-210: Ruff (W293)
cortex/llm/interpreter.py:210:1: W293 Blank line contains whitespace


[failure] 207-207: Ruff (W293)
cortex/llm/interpreter.py:207:1: W293 Blank line contains whitespace

🪛 GitHub Check: Lint
cortex/llm/interpreter.py

[failure] 175-175: Ruff (W293)
cortex/llm/interpreter.py:175:1: W293 Blank line contains whitespace


[failure] 166-166: Ruff (W293)
cortex/llm/interpreter.py:166:1: W293 Blank line contains whitespace


[failure] 160-160: Ruff (W293)
cortex/llm/interpreter.py:160:1: W293 Blank line contains whitespace


[failure] 156-156: Ruff (W293)
cortex/llm/interpreter.py:156:1: W293 Blank line contains whitespace


[failure] 151-151: Ruff (W293)
cortex/llm/interpreter.py:151:1: W293 Blank line contains whitespace


[failure] 148-148: Ruff (W293)
cortex/llm/interpreter.py:148:1: W293 Blank line contains whitespace


[failure] 221-221: Ruff (W293)
cortex/llm/interpreter.py:221:1: W293 Blank line contains whitespace


[failure] 218-218: Ruff (W293)
cortex/llm/interpreter.py:218:1: W293 Blank line contains whitespace


[failure] 210-210: Ruff (W293)
cortex/llm/interpreter.py:210:1: W293 Blank line contains whitespace


[failure] 207-207: Ruff (W293)
cortex/llm/interpreter.py:207:1: W293 Blank line contains whitespace

🪛 LanguageTool
docs/OLLAMA_QUICKSTART.md

[style] ~135-~135: Using many exclamation marks might seem excessive (in this case: 4 exclamation marks for a text that’s 1384 characters long)
Context: ...come to privacy-first package management!** 🎉

(EN_EXCESSIVE_EXCLAMATION)

docs/OLLAMA_IMPLEMENTATION_SUMMARY.md

[style] ~3-~3: Some style guides suggest that commas should set off the year in a month-day-year date.
Context: ...ntation Summary Date: December 24, 2025 Feature: Local LLM Support via Ol...

(MISSING_COMMA_AFTER_YEAR)

🪛 markdownlint-cli2 (0.18.1)
docs/OLLAMA_INTEGRATION.md

42-42: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


91-91: Bare URL used

(MD034, no-bare-urls)

docs/PR_OLLAMA_INTEGRATION.md

60-60: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


306-306: Bare URL used

(MD034, no-bare-urls)


307-307: Bare URL used

(MD034, no-bare-urls)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Agent
  • GitHub Check: test (3.10)
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.11)
🔇 Additional comments (19)
examples/sample-config.yaml (1)

5-28: LGTM! Well-structured LLM configuration.

The new LLM configuration section is clear and comprehensive, providing good examples for both local Ollama usage and cloud provider fallbacks. The model preferences align with the code-focused optimization mentioned in the PR objectives.

tests/test_cli.py (1)

63-66: LGTM! Test correctly reflects Ollama fallback behavior.

The updated test expectation is correct—with Ollama integration, the CLI should succeed even without cloud API keys by falling back to local Ollama. The comment provides helpful context for this behavioral change.

docs/OLLAMA_QUICKSTART.md (1)

1-135: LGTM! Excellent quickstart guide.

The quickstart documentation is well-structured, clear, and appropriately progressive—starting with basic installation and gradually introducing advanced features. The examples are practical and the troubleshooting section addresses common issues effectively.

tests/test_llm_router.py (3)

38-88: LGTM! Routing tests appropriately updated for Ollama integration.

The test updates correctly reflect the new routing behavior where Ollama is the default local provider with smart fallbacks to cloud providers. The use of assertIn with multiple allowed providers makes the tests resilient to fallback scenarios while still validating correct routing behavior.


100-108: LGTM! Fallback test correctly handles Ollama unavailability.

The test appropriately verifies that when Ollama is unavailable, the system falls back to cloud providers (Claude or Kimi K2) as expected. This validates the smart fallback logic described in the PR objectives.


428-476: LGTM! End-to-end tests reflect Ollama-first routing.

The integration tests correctly validate the complete routing flow with Ollama as the preferred provider and appropriate fallbacks. The test expectations account for real-world scenarios where providers may be unavailable.

docs/OLLAMA_IMPLEMENTATION_SUMMARY.md (1)

1-377: LGTM! Comprehensive implementation documentation.

This implementation summary provides excellent technical depth and coverage of the Ollama integration. The document effectively serves as both a reference for developers and a guide for users, with clear sections on architecture, usage, configuration, and migration. The performance considerations and security sections are particularly valuable.

CHANGELOG.md (1)

11-50: LGTM!

The changelog entries accurately document the Ollama integration changes and follow the Keep a Changelog format. The additions cover features, changes, and security enhancements comprehensively.

cortex/llm/interpreter.py (1)

197-257: Enhanced JSON parsing logic looks robust.

The improved _parse_commands method properly handles:

  • Markdown code block extraction (lines 199-206)
  • JSON object/array detection (lines 211-220)
  • Matching bracket counting (lines 223-244)
  • Error logging with context (lines 254-257)

This significantly improves resilience to varied LLM response formats.

examples/ollama_demo.py (1)

30-67: LGTM! Well-structured demo with good error handling.

The demo functions effectively showcase:

  • Ollama status checking with clear feedback
  • Local completion with timing metrics
  • Router integration with fallback demonstration
  • Provider comparison table

Error handling is properly implemented with try-except blocks, and Rich console output provides a professional user experience.

Also applies to: 70-98, 100-135, 137-157

setup.py (1)

8-19: Post-install hooks gracefully handle failures.

Both PostInstallCommand and PostDevelopCommand properly:

  • Run the base class installation first
  • Execute the Ollama setup script with check=False (non-blocking)
  • Catch exceptions and provide user-friendly messages
  • Inform users about manual setup options

This ensures installation doesn't fail if Ollama setup encounters issues.

Also applies to: 21-31
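
A minimal sketch of the non-blocking hook pattern described above; the class name and script path follow the review text, but details may differ from the PR:

# Sketch of a post-install hook that must not fail the installation.
import subprocess
import sys

from setuptools.command.install import install

class PostInstallCommand(install):
    def run(self):
        install.run(self)  # run the normal installation first
        try:
            # check=False keeps a failed Ollama setup from breaking the install
            subprocess.run([sys.executable, "scripts/setup_ollama.py"], check=False)
        except Exception:
            print("Ollama setup skipped - run manually: python scripts/setup_ollama.py")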

scripts/setup_ollama.py (1)

211-256: LGTM! Proper environment detection and graceful degradation.

The setup_ollama() function properly:

  • Checks for skip flags (CORTEX_SKIP_OLLAMA_SETUP, CI, GITHUB_ACTIONS) to avoid running in automated environments
  • Detects interactive terminals with sys.stdin.isatty() to skip prompts in non-interactive mode
  • Provides clear user guidance on manual setup if automated setup fails
  • Handles failures gracefully without blocking Cortex installation
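
A hedged sketch of the guard logic described in these bullets; the helper name is an assumption, while the environment variables and the isatty() check come from the review text:

# Illustrative guard: decide when to skip prompts and automated Ollama setup.
import os
import sys

def is_non_interactive() -> bool:
    """True in CI, when setup is explicitly skipped, or when stdin is not a TTY."""
    if os.getenv("CORTEX_SKIP_OLLAMA_SETUP"):
        return True
    if os.getenv("CI") or os.getenv("GITHUB_ACTIONS"):
        return True
    return not sys.stdin.isatty()
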
docs/OLLAMA_INTEGRATION.md (1)

707-773: The test expectations have already been updated to reflect Ollama-first routing—no further changes needed.

All tests mentioned in this review have been corrected to expect Ollama as the primary provider with fallback to other providers:

  • test_install_no_api_key: Already expects exit code 0 (success with Ollama)
  • test_complete_with_routing: Already accepts OLLAMA as a valid provider alongside fallbacks
  • test_fallback_on_error: Already accepts OLLAMA as a valid provider alongside fallbacks
  • test_complete_batch: Already updated to match current routing
  • test_is_running: Already fixed with proper ConnectionError handling
  • test_complete_with_ollama: Already returns proper dict structure

The LLMRouter class confirms all task types route to LLMProvider.OLLAMA as the primary provider by default.

README.md (2)

66-75: LGTM!

The updated feature table effectively communicates the privacy-first, offline-capable nature of the Ollama integration with clear icons and descriptions.


174-206: LGTM!

The Local LLM Support section is comprehensive, covering benefits, recommended models with sizes, management commands, and cloud fallback configuration. The documentation aligns well with the implementation.

cortex/llm_router.py (1)

148-167: LGTM!

The Ollama initialization with graceful fallback handling is well-structured. The service startup and model availability checks provide good resilience with informative logging.

cortex/providers/ollama_provider.py (2)

57-76: LGTM!

The constructor is well-designed with sensible defaults, environment variable support for OLLAMA_HOST, and clear parameter documentation. The caching mechanism for _available_models and _selected_model is appropriate.


193-249: LGTM!

The model discovery and selection logic is well-implemented with appropriate caching, preference ordering, and fallback handling. The logging provides good visibility into model selection decisions.

test_output.txt (1)

707-773: The test failures listed in the output do not reflect the current state of the codebase. The tests have been updated to expect Ollama as the primary routing provider with proper fallback to Claude/Kimi K2, exception handling in OllamaProvider.is_running() is implemented correctly, and test expectations for API key requirements and mock configurations are aligned with the implementation. No actionable issues remain—the review comment is based on an outdated snapshot of the tests.

Likely an incorrect or invalid review comment.

Comment on lines +375 to +383
print("\nPulling codellama:7b model (this may take a few minutes)...")
try:
subprocess.run(["ollama", "pull", "llama3.2"], check=True)
subprocess.run(["ollama", "pull", "codellama:7b"], check=True)
print("\n✓ Model ready!")
except subprocess.CalledProcessError:
print("\n⚠ Could not pull model - you can do this later with: ollama pull llama3.2")
print("\n⚠ Could not pull model - you can do this later with: ollama pull codellama:7b")

self.config["api_provider"] = "ollama"
self.config["ollama_model"] = "llama3.2"
self.config["ollama_model"] = "codellama:7b"
Contributor


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify default model references across the codebase

echo "=== Checking for codellama:7b references ==="
rg -n "codellama:7b" --type=py --type=md

echo ""
echo "=== Checking for phi3:mini references ==="
rg -n "phi3:mini" --type=py --type=md

echo ""
echo "=== Checking setup_ollama.py for default model ==="
find . -name "setup_ollama.py" -type f | head -5

echo ""
echo "=== Checking ollama_provider.py ==="
find . -name "ollama_provider.py" -type f | head -5

Repository: cortexlinux/cortex

Length of output: 3407


🏁 Script executed:

cat -n cortex/providers/ollama_provider.py | head -80

Repository: cortexlinux/cortex

Length of output: 2876


🏁 Script executed:

sed -n '115,160p' scripts/setup_ollama.py

Repository: cortexlinux/cortex

Length of output: 1591


Critical model inconsistency: first_run_wizard.py hardcodes codellama:7b, which is not in OllamaProvider's preferred or fallback models.

The first-run wizard unconditionally pulls codellama:7b (lines 375-383), but:

  1. OllamaProvider does not list codellama:7b in PREFERRED_MODELS or FALLBACK_MODELS—it lists phi3:mini, llama3:8b, and mistral:7b as fallbacks
  2. Documentation consistently references phi3:mini as the default (lightweight, 1.9GB)
  3. The setup script (setup_ollama.py) offers user choice with codellama:7b as default, contradicting the wizard's approach

This creates a critical integration issue: the wizard pulls a model the provider code doesn't recognize in its preference hierarchy, potentially causing unexpected behavior during model selection fallback logic.

Resolution: first_run_wizard.py should either pull from the provider's PREFERRED_MODELS list (starting with deepseek-coder-v2:16b, fallback to phi3:mini) or consistently use the documented phi3:mini default to align with OllamaProvider's expectations and documentation.

🤖 Prompt for AI Agents
In cortex/first_run_wizard.py around lines 375-383, the wizard currently
hardcodes pulling "codellama:7b" which conflicts with OllamaProvider's model
lists; change it to derive the model choice from OllamaProvider instead (or at
minimum use the documented lightweight default "phi3:mini"). Import or reference
OllamaProvider, pick model = OllamaProvider.PREFERRED_MODELS[0] if present else
OllamaProvider.FALLBACK_MODELS[0] (or default to "phi3:mini"), use that variable
in the subprocess.run pull command and in self.config["ollama_model"], and
update the user-facing print messages to reflect the chosen model string.
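
A hedged sketch of that direction, assuming OllamaProvider exposes PREFERRED_MODELS and FALLBACK_MODELS class attributes as stated above (written standalone here rather than inside the wizard method):

# Sketch only: pick the wizard's model from OllamaProvider's lists instead of
# hardcoding codellama:7b. Attribute names are taken from this review comment.
import subprocess

from cortex.providers import OllamaProvider

preferred = list(getattr(OllamaProvider, "PREFERRED_MODELS", []))
fallback = list(getattr(OllamaProvider, "FALLBACK_MODELS", []))
model = (preferred or fallback or ["phi3:mini"])[0]

print(f"\nPulling {model} model (this may take a few minutes)...")
try:
    subprocess.run(["ollama", "pull", model], check=True)
    print("\n✓ Model ready!")
except subprocess.CalledProcessError:
    print(f"\n⚠ Could not pull model - you can do this later with: ollama pull {model}")

config = {"api_provider": "ollama", "ollama_model": model}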

Comment on lines +1 to +5
"""Cortex Providers Package"""

from cortex.providers.ollama_provider import OllamaProvider

__all__ = ["OllamaProvider"]
Contributor


🛠️ Refactor suggestion | 🟠 Major

Add module docstring per coding guidelines.

The file is missing a module-level docstring. Per the coding guidelines, "Docstrings required for all public APIs." This package initialization exposes the public OllamaProvider API and should include a docstring describing the providers package.

🔎 Proposed fix
+"""
+Cortex Providers Package
+
+This package provides LLM provider implementations for Cortex Linux.
+Currently includes:
+- OllamaProvider: Local LLM inference via Ollama
+"""
+
 """Cortex Providers Package"""
 
 from cortex.providers.ollama_provider import OllamaProvider

Note: The existing single-line docstring on line 1 is minimal. Consider replacing it with the more comprehensive version above.

As per coding guidelines, docstrings are required for all public APIs.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In cortex/providers/__init__.py around lines 1 to 5, the module-level docstring
is too minimal (just a single-line) and the file must include a clear module
docstring describing the providers package and the public API it exposes; update
the top of the file to replace or expand the current docstring with a brief
multi-line description of the package purpose, the fact that it exposes
OllamaProvider as the public provider, and any intended usage notes or stability
guarantees so it satisfies the coding guideline requiring docstrings for all
public APIs.

Comment on lines 100 to 136
try:
    # Official Ollama installation script
    result = subprocess.run(
        ["curl", "-fsSL", "https://ollama.com/install.sh"],
        capture_output=True,
        text=True,
        timeout=60,
    )

    if result.returncode != 0:
        logger.error(f"Failed to download Ollama installer: {result.stderr}")
        return False

    # Execute installation script
    install_result = subprocess.run(
        ["sh", "-c", result.stdout],
        capture_output=True,
        text=True,
        timeout=300,
    )

    if install_result.returncode == 0:
        logger.info("✅ Ollama installed successfully")
        # Start Ollama service
        subprocess.run(["ollama", "serve"],
                       stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL,
                       start_new_session=True)
        time.sleep(2)  # Give service time to start
        return True
    else:
        logger.error(f"Ollama installation failed: {install_result.stderr}")
        return False

except Exception as e:
    logger.error(f"Error installing Ollama: {e}")
    return False
Contributor


⚠️ Potential issue | 🟠 Major

❓ Verification inconclusive

Script executed:

find . -name "ollama_provider.py" -type f

Repository: cortexlinux/cortex
Exit code: 0

stdout:

./cortex/providers/ollama_provider.py

Script executed:

head -150 cortex/providers/ollama_provider.py | tail -60

Repository: cortexlinux/cortex
Exit code: 0

stdout:

        
        Returns:
            True if installation succeeded, False otherwise
        """
        if OllamaProvider.is_installed():
            logger.info("✅ Ollama already installed")
            return True

        logger.info("📦 Installing Ollama...")
        try:
            # Official Ollama installation script
            result = subprocess.run(
                ["curl", "-fsSL", "https://ollama.com/install.sh"],
                capture_output=True,
                text=True,
                timeout=60,
            )
            
            if result.returncode != 0:
                logger.error(f"Failed to download Ollama installer: {result.stderr}")
                return False

            # Execute installation script
            install_result = subprocess.run(
                ["sh", "-c", result.stdout],
                capture_output=True,
                text=True,
                timeout=300,
            )

            if install_result.returncode == 0:
                logger.info("✅ Ollama installed successfully")
                # Start Ollama service
                subprocess.run(["ollama", "serve"], 
                             stdout=subprocess.DEVNULL, 
                             stderr=subprocess.DEVNULL,
                             start_new_session=True)
                time.sleep(2)  # Give service time to start
                return True
            else:
                logger.error(f"Ollama installation failed: {install_result.stderr}")
                return False

        except Exception as e:
            logger.error(f"Error installing Ollama: {e}")
            return False

    def is_running(self) -> bool:
        """
        Check if Ollama service is running.
        
        Returns:
            True if service is accessible, False otherwise
        """
        try:
            response = requests.get(
                f"{self.base_url}/api/tags",
                timeout=5
            )
            return response.status_code == 200

Script executed:

wc -l cortex/providers/ollama_provider.py

Repository: cortexlinux/cortex
Exit code: 0

stdout:

439 cortex/providers/ollama_provider.py

Script executed:

sed -n '75,140p' cortex/providers/ollama_provider.py

Repository: cortexlinux/cortex
Exit code: 0

stdout:

        self._selected_model: str | None = None

    @staticmethod
    def is_installed() -> bool:
        """
        Check if Ollama is installed on the system.
        
        Returns:
            True if Ollama is available, False otherwise
        """
        return shutil.which("ollama") is not None

    @staticmethod
    def install_ollama() -> bool:
        """
        Install Ollama on the system.
        
        Returns:
            True if installation succeeded, False otherwise
        """
        if OllamaProvider.is_installed():
            logger.info("✅ Ollama already installed")
            return True

        logger.info("📦 Installing Ollama...")
        try:
            # Official Ollama installation script
            result = subprocess.run(
                ["curl", "-fsSL", "https://ollama.com/install.sh"],
                capture_output=True,
                text=True,
                timeout=60,
            )
            
            if result.returncode != 0:
                logger.error(f"Failed to download Ollama installer: {result.stderr}")
                return False

            # Execute installation script
            install_result = subprocess.run(
                ["sh", "-c", result.stdout],
                capture_output=True,
                text=True,
                timeout=300,
            )

            if install_result.returncode == 0:
                logger.info("✅ Ollama installed successfully")
                # Start Ollama service
                subprocess.run(["ollama", "serve"], 
                             stdout=subprocess.DEVNULL, 
                             stderr=subprocess.DEVNULL,
                             start_new_session=True)
                time.sleep(2)  # Give service time to start
                return True
            else:
                logger.error(f"Ollama installation failed: {install_result.stderr}")
                return False

        except Exception as e:
            logger.error(f"Error installing Ollama: {e}")
            return False

    def is_running(self) -> bool:
        """
        Check if Ollama service is running.

Script executed:

sed -n '1,30p' cortex/providers/ollama_provider.py

Repository: cortexlinux/cortex
Exit code: 0

stdout:

#!/usr/bin/env python3
"""
Ollama Provider for Cortex Linux
Enables local LLM support for privacy-first, offline package management.

Features:
- Auto-detect Ollama installation
- Smart model selection (prefers code-focused models)
- Streaming responses
- Zero data sent to cloud
- Fully offline capable

Author: Cortex Linux Team
License: Apache 2.0
"""

import json
import logging
import os
import shutil
import subprocess
import time
from typing import Any, Generator

import requests

logger = logging.getLogger(__name__)


class OllamaProvider:

🌐 Web query:

Ollama install.sh checksum verification security

💡 Result:

Short answer: Ollama's install.sh (curl ... | sh) does not itself verify checksums; the safer approach is to download the installer or package, inspect the script, and verify the package SHA256 against the published release checksums (Ollama publishes SHA256 hashes on its GitHub releases). Steps below.

Why: the install script downloads a tarball over HTTPS and extracts it without checksum verification, so piping it directly to sh risks executing a tampered script if your network or TLS is compromised. [1][2]

Quick secure workflow (Linux):

  1. Download the install script and inspect before running:

  2. Or manually download the package and verify SHA256 (recommended):

    • ARCH=amd64
    • curl -fsSL -o ollama.tgz "https://ollama.com/download/ollama-linux-${ARCH}.tgz"
    • sha256sum ollama.tgz
    • Compare the output to the SHA256 published in the Ollama GitHub release assets (each release lists sha256 for its artifacts). If they match, extract and install. [3]

Example compare source (GitHub releases show sha256 lines for each asset). [3]

Notes / recommendations:

  • Always verify the script contents before piping to sh; prefer manual download + checksum verification when possible. [1][2][3]
  • Use an authenticated channel (HTTPS + valid certs) and up-to-date TLS/OS packages.
  • If you need, I can fetch the latest release SHA256 for your platform and give exact commands to verify it now — tell me your OS/arch.

Sources:

  • Ollama install.sh source (downloaded script). [1]
  • Ollama docs/install page recommending curl | sh. [2]
  • Ollama GitHub releases (assets include sha256 hashes). [3]

🌐 Web query:

Ollama official installation script recommended practices

💡 Result:

Here are concise recommended practices when using Ollama’s official install script (install.sh):

  1. Inspect the script before running (don’t blindly pipe to sh). Download and review https://ollama.com/install.sh first. [1]
  2. Use the safe curl flags shown in docs (curl -fsSL) and run with sh: curl -fsSL https://ollama.com/install.sh | sh, or download then run locally. [2]
  3. Prefer downloading+verifying the script/binary before executing if you require supply-chain safety (save file, check contents, run from disk). [1][2]
  4. Run with appropriate privileges: the script needs sudo/root to install to system locations; run as root or ensure sudo is available. [1]
  5. Ensure required tools are present (curl, awk, grep, sed, tee, xargs) and supported architecture (amd64/arm64); the script checks these. [1]
  6. For WSL: use WSL2 (script will reject WSL1). [1]
  7. Use OLLAMA_VERSION environment variable if you need a specific release. [1]
  8. For restricted/offline networks, use the script’s offline/download hints or pre-download bundles and re-run (script has support / improvements for offline installs). [3]
  9. After install, verify binary and service (ollama --version, check systemd status) and follow Ollama docs for post-install configuration and security (firewall, run as non-root daemon where possible). [2]
  10. Keep the installation up to date via the official update mechanism or re-run the installer from the official source; follow official docs for upgrades. [2]

Sources: official install script and docs, plus install-script offline-improvement PR.
[1] Ollama install.sh (official).
[2] Ollama documentation / installation guide.
[3] Ollama GitHub PR improving install.sh offline behavior.


Script executed:

grep -n "def " cortex/providers/ollama_provider.py | head -20

Repository: cortexlinux/cortex
Exit code: 0

stdout:

57:    def __init__(
78:    def is_installed() -> bool:
88:    def install_ollama() -> bool:
138:    def is_running(self) -> bool:
154:    def start_service(self) -> bool:
193:    def get_available_models(self) -> list[str]:
218:    def select_best_model(self) -> str | None:
251:    def pull_model(self, model_name: str) -> bool:
287:    def ensure_model_available(self) -> str | None:
313:    def complete(
374:    def _messages_to_prompt(self, messages: list[dict[str, str]]) -> str:
400:    def _stream_response(self, response: requests.Response) -> Generator[dict[str, Any], None, None]:
418:    def get_model_info(self, model_name: str) -> dict[str, Any] | None:

Script executed:

sed -n '80,95p' cortex/providers/ollama_provider.py

Repository: cortexlinux/cortex
Exit code: 0

stdout:

        Check if Ollama is installed on the system.
        
        Returns:
            True if Ollama is available, False otherwise
        """
        return shutil.which("ollama") is not None

    @staticmethod
    def install_ollama() -> bool:
        """
        Install Ollama on the system.
        
        Returns:
            True if installation succeeded, False otherwise
        """
        if OllamaProvider.is_installed():

Security concern: Install script executed without verification.

The install_ollama() method downloads and executes Ollama's remote installation script without inspection or verification. While this is the official installation method, Ollama's own documentation recommends downloading and inspecting the script first, or verifying the package SHA256 against published hashes on GitHub releases before execution. Current implementation skips these safeguards. Consider:

  1. Saving the script to a file and validating its contents before execution
  2. Using Ollama's package verification method (download tarball, verify SHA256 against GitHub releases)
  3. Documenting this security consideration in the method's docstring
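A minimal sketch of option 2 above, assuming the caller supplies the tarball URL and the expected hash taken from the matching GitHub release assets; the function name and destination path are illustrative and not part of the existing provider:

import hashlib
import urllib.request

def download_and_verify(url: str, expected_sha256: str, dest: str = "/tmp/ollama.tgz") -> str:
    """Download a release tarball and refuse to proceed unless its SHA256 matches the published hash."""
    urllib.request.urlretrieve(url, dest)
    digest = hashlib.sha256()
    with open(dest, "rb") as fh:
        for chunk in iter(lambda: fh.read(8192), b""):
            digest.update(chunk)
    if digest.hexdigest() != expected_sha256:
        raise ValueError("SHA256 mismatch - refusing to install")
    return dest  # only extract/install from this path after the check passes

Only after this verification would install_ollama() extract the archive or run the saved script.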

Comment on lines 123 to 129
                # Start Ollama service
                subprocess.run(["ollama", "serve"],
                             stdout=subprocess.DEVNULL,
                             stderr=subprocess.DEVNULL,
                             start_new_session=True)
                time.sleep(2)  # Give service time to start
                return True

⚠️ Potential issue | 🔴 Critical

subprocess.run with start_new_session doesn't background the process properly.

Using subprocess.run() (which blocks) with start_new_session=True doesn't achieve backgrounding. The code will still wait for ollama serve to complete, which it never does since it's a long-running service. This should use subprocess.Popen instead (as done correctly in start_service() on line 172).

🔎 Proposed fix
             if install_result.returncode == 0:
                 logger.info("✅ Ollama installed successfully")
                 # Start Ollama service
-                subprocess.run(["ollama", "serve"], 
-                             stdout=subprocess.DEVNULL, 
-                             stderr=subprocess.DEVNULL,
-                             start_new_session=True)
+                subprocess.Popen(
+                    ["ollama", "serve"],
+                    stdout=subprocess.DEVNULL,
+                    stderr=subprocess.DEVNULL,
+                    start_new_session=True,
+                )
                 time.sleep(2)  # Give service time to start
                 return True
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
                # Start Ollama service
                subprocess.run(["ollama", "serve"],
                             stdout=subprocess.DEVNULL,
                             stderr=subprocess.DEVNULL,
                             start_new_session=True)
                time.sleep(2)  # Give service time to start
                return True
                # Start Ollama service
                subprocess.Popen(
                    ["ollama", "serve"],
                    stdout=subprocess.DEVNULL,
                    stderr=subprocess.DEVNULL,
                    start_new_session=True,
                )
                time.sleep(2)  # Give service time to start
                return True
🤖 Prompt for AI Agents
In cortex/providers/ollama_provider.py around lines 123 to 129, the code uses
subprocess.run([...], start_new_session=True) to start the long-running "ollama
serve" which blocks; replace that call with subprocess.Popen([...],
stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, start_new_session=True) so
the service is started in the background without blocking (match the pattern
used in start_service() at line 172), then keep the existing time.sleep(2) and
return True (or store/close the Popen handle as done elsewhere if required).

]
else:
requirements = ["anthropic>=0.18.0", "openai>=1.0.0"]
requirements = ["anthropic>=0.18.0", "openai>=1.0.0", "requests>=2.32.4"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

latest anthropic Python package version PyPI 2025

💡 Result:

The latest anthropic PyPI release is version 0.75.0 (uploaded Nov 24, 2025). [1][2]

Sources:
[1] PyPI project page — anthropic.
[2] Safety DB package page — anthropic.


🌐 Web query:

latest openai Python package version PyPI 2025

💡 Result:

The latest OpenAI Python package on PyPI is version 2.14.0 (uploaded Dec 19, 2025). [1]

Reference:
[1] PyPI — openai project release (openai 2.14.0).


🌐 Web query:

latest requests Python package version PyPI 2025

💡 Result:

The latest requests release on PyPI is 2.32.5 (uploaded Aug 18–19, 2025). [1][2]

Sources:

  • PyPI project page for requests. [1]
  • Safety DB listing for requests (latest v2.32.5). [2]

🌐 Web query:

anthropic Python package security vulnerabilities CVE 2025

💡 Result:

Short answer: I could not find any CVEs in 2025 that specifically target the PyPI "anthropic" Python package. The only CVE listed for that package in public vulnerability databases is older (CVE‑2022‑23491). [1]

However, several 2025 CVEs affect Anthropic projects (mostly the claude-code tool and other Anthropic services). Notable 2025 entries:

  • CVE‑2025‑52882 — Improper authorization in @Anthropic‑ai/claude‑code (unauthorized websocket connections). [2]
  • CVE‑2025‑54795 / CVE‑2025‑55284 — Command‑injection / command execution issues in @Anthropic‑ai/claude‑code (multiple advisories/patches in 2025). [3][4]
  • CVE‑2025‑58764, CVE‑2025‑59041, CVE‑2025‑59829, CVE‑2025‑64755, CVE‑2025‑66032 — various command execution, arbitrary file write, symlink/permission bypass and related vulnerabilities reported against @Anthropic‑ai/claude‑code in 2025. [5][6][7]

Sources:

  • Anthropic (PyPI) vulnerability summary / SafetyDB (shows CVE‑2022‑23491 for anthropic package). [1]
  • GitHub advisory / NVD entry for Anthropic Slack MCP data exfiltration CVE‑2025‑34072. [2]
  • Snyk advisories for claude‑code (CVE‑2025‑52882, CVE‑2025‑54795, CVE‑2025‑55284). [3][4]
  • Vulert / vulnerability databases listing multiple 2025 CVEs for @Anthropic‑ai/claude‑code (CVE‑2025‑58764, CVE‑2025‑59041, CVE‑2025‑59829, CVE‑2025‑64755, CVE‑2025‑66032). [5][6][7]

If you want, I can:

  • fetch the specific CVE identifiers, publication dates and patched versions for each of the claude‑code CVEs; or
  • check whether a specific version of the anthropic Python package you use is affected.

🌐 Web query:

openai Python package security vulnerabilities CVE 2025

💡 Result:

Short answer: I found no CVE in 2025 for the official openai Python package on PyPI. Third‑party projects and plugins that reference "OpenAI" or "openai" have CVEs in 2025 (examples below). Sources:

  • No known vulnerabilities listed for the official "openai" PyPI package (vendor/scan page). [1]
  • CVE‑2025‑31843: Missing authorization in "OpenAI Tools for WordPress & WooCommerce" plugin. [2]
  • CVE‑2025‑61260: Command‑injection in OpenAI Codex CLI (tracked by PT Security/dbugs). [3]
  • PYSEC‑2025‑50 / CVE‑2025‑48887: ReDoS in vLLM's OpenAI tool parser (OSV advisory). [4]

If you want, I can:

  • run a deeper search for any newly assigned CVEs after Dec 24, 2025, or
  • monitor specific repos (openai on PyPI/GitHub) and send a timeline of any future CVE entries.

🌐 Web query:

requests Python package security vulnerabilities CVE 2025

💡 Result:

Short answer: Yes — several CVE-2025 entries affect the Python requests package (some are distro-packaging advisories listing CVE-2025 IDs; others are CPython vulnerabilities that can indirectly affect code using requests). Action: update requests to a patched release from your distro or to the latest upstream release, and also keep your Python interpreter patched.

Key references:

  • Vulnerability listing for python-requests (multiple CVE‑2025 IDs reported by Snyk). [1]
  • Example: ALAS / Amazon advisory for the related certificate-verification bug (CVE-2024-35195) fixed in requests 2.32.0 (shows how distro packages track/fix requests issues). [2]
  • CPython security issues in 2025 (e.g., CVE‑2025‑6069 HTMLParser quadratic‑complexity — can affect Python apps, including those using requests). [3]

If you want, tell me your environment (OS/distro, requests version, Python version) and I’ll list which of these CVEs are relevant to you and give exact upgrade/patch commands.


Update minimum dependency versions to more recent stable releases.

The specified minimum versions are outdated:

  • anthropic>=0.18.0 should be updated to at least 0.75.0 (current stable as of Nov 2025)
  • openai>=1.0.0 should be updated to at least 2.14.0 (current stable as of Dec 2025)
  • requests>=2.32.4 can be updated to 2.32.5 for the latest patch

While no active CVEs were found for these packages at the specified versions, using very old minimum versions (2+ years old for anthropic and openai) is risky for compatibility and future maintenance. Consider using more recent, well-tested stable releases or pinning exact versions for production stability.

🤖 Prompt for AI Agents
In setup.py around line 50, the requirements list uses outdated minimum
versions; update the entry to require newer stable minimums: set anthropic to
>=0.75.0, openai to >=2.14.0, and requests to >=2.32.5 (or pin exact versions if
you prefer reproducible installs), then run your dependency install/tests to
verify compatibility.
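For illustration, the fallback list with the newer floors suggested above would look roughly like this (re-run the test suite before adopting the pins):

requirements = ["anthropic>=0.75.0", "openai>=2.14.0", "requests>=2.32.5"]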

Comment on lines 41 to 58
    @patch('cortex.providers.ollama_provider.requests.get')
    def test_is_running(self, mock_get):
        """Test Ollama service detection."""
        # Test when running
        mock_response = Mock()
        mock_response.status_code = 200
        mock_get.return_value = mock_response
        mock_get.side_effect = None  # Clear any side effects

        provider = OllamaProvider()
        self.assertTrue(provider.is_running())

        # Test when not running - use RequestException
        from requests.exceptions import ConnectionError
        mock_get.side_effect = ConnectionError("Connection refused")

        provider2 = OllamaProvider()
        self.assertFalse(provider2.is_running())

⚠️ Potential issue | 🟡 Minor

Fix test structure and mock configuration.

Several issues in this test:

  1. Import placement (line 54): ConnectionError is imported inside the test method. Move imports to the top of the file for consistency with PEP 8.
  2. Inconsistent exception (line 53 comment vs line 55 code): Comment mentions "RequestException" but code uses ConnectionError. Update comment for accuracy.
  3. Unnecessary instance creation (line 57): Creates a second provider instance provider2 instead of reusing the first. This creates confusion and doesn't test the same instance behavior.
  4. Mock side_effect issue: Setting side_effect directly causes the actual test failure seen in test_output.txt. The mock should be configured to handle both cases properly.
🔎 Proposed fix

Move import to top of file:

from requests.exceptions import ConnectionError

Then update the test:

+from requests.exceptions import ConnectionError
+
 class TestOllamaProvider(unittest.TestCase):
     """Test Ollama provider functionality."""
 
     # ... other tests ...
 
     @patch('cortex.providers.ollama_provider.requests.get')
     def test_is_running(self, mock_get):
         """Test Ollama service detection."""
         # Test when running
         mock_response = Mock()
         mock_response.status_code = 200
         mock_get.return_value = mock_response
-        mock_get.side_effect = None  # Clear any side effects
         
         provider = OllamaProvider()
         self.assertTrue(provider.is_running())
         
-        # Test when not running - use RequestException
-        from requests.exceptions import ConnectionError
+        # Test when not running - use ConnectionError
         mock_get.side_effect = ConnectionError("Connection refused")
-        
-        provider2 = OllamaProvider()
-        self.assertFalse(provider2.is_running())
+        self.assertFalse(provider.is_running())
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
    @patch('cortex.providers.ollama_provider.requests.get')
    def test_is_running(self, mock_get):
        """Test Ollama service detection."""
        # Test when running
        mock_response = Mock()
        mock_response.status_code = 200
        mock_get.return_value = mock_response
        mock_get.side_effect = None  # Clear any side effects
        provider = OllamaProvider()
        self.assertTrue(provider.is_running())
        # Test when not running - use RequestException
        from requests.exceptions import ConnectionError
        mock_get.side_effect = ConnectionError("Connection refused")
        provider2 = OllamaProvider()
        self.assertFalse(provider2.is_running())
    @patch('cortex.providers.ollama_provider.requests.get')
    def test_is_running(self, mock_get):
        """Test Ollama service detection."""
        # Test when running
        mock_response = Mock()
        mock_response.status_code = 200
        mock_get.return_value = mock_response
        provider = OllamaProvider()
        self.assertTrue(provider.is_running())
        # Test when not running - use ConnectionError
        mock_get.side_effect = ConnectionError("Connection refused")
        self.assertFalse(provider.is_running())
🤖 Prompt for AI Agents
In tests/test_ollama_integration.py around lines 41-58, fix the test structure
and mock configuration: move "from requests.exceptions import ConnectionError"
to the top-level imports, update the comment to accurately say "ConnectionError"
instead of "RequestException", reuse the same provider instance instead of
creating provider2, and configure the mock correctly by clearing side_effect and
setting mock_get.return_value to a Mock(status_code=200) for the first
assertion, then set mock_get.side_effect = ConnectionError("Connection refused")
for the second assertion (or temporarily set mock_get.return_value = None and
side_effect accordingly) so the same provider.is_running() call is exercised for
both running and not-running cases without leaving side_effect set incorrectly.

…sts to enhance Ollama integration and error handling

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
tests/test_llm_router.py (1)

1-782: Run Black formatter to fix code style.

The CI pipeline indicates this file needs Black formatting. Run black tests/test_llm_router.py to address the formatting issues.

setup.py (1)

1-90: Run Black formatter to fix code style.

The CI pipeline indicates this file needs Black formatting. Run black setup.py to address the formatting issues.

cortex/llm_router.py (1)

1-1028: Run Black formatter to fix code style.

The CI pipeline indicates this file needs Black formatting. Run black cortex/llm_router.py to address the formatting issues.

♻️ Duplicate comments (4)
tests/test_ollama_integration.py (2)

41-58: Fix test structure and mock configuration.

The issues identified in past reviews remain valid:

  1. Move from requests.exceptions import ConnectionError to top-level imports (line 54)
  2. Reuse the same provider instance instead of creating provider2 (line 57)
  3. The mock configuration with side_effect needs proper handling for both test cases

163-190: Fix mock configuration for router integration test.

The past review correctly identifies that the mock setup doesn't match how the router actually uses the Ollama provider. The response will contain a MagicMock instead of the expected string content, causing test failures.

scripts/setup_ollama.py (1)

203-205: Fix missing f-string prefix.

Line 204 is missing the f prefix for the f-string, so {model_name} won't be interpolated.

🔎 Proposed fix
     except subprocess.TimeoutExpired:
-        logger.warning("⚠️  Model download timed out - try again with: ollama pull {model_name}")
+        logger.warning(f"⚠️  Model download timed out - try again with: ollama pull {model_name}")
         return False
cortex/providers/ollama_provider.py (1)

122-130: Critical: Use Popen instead of subprocess.run for background service.

Line 125-128 uses subprocess.run() which blocks indefinitely waiting for ollama serve to complete. This should use subprocess.Popen() to start the service in the background, matching the pattern used in start_service() at line 173.

🔎 Proposed fix
             if install_result.returncode == 0:
                 logger.info("✅ Ollama installed successfully")
                 # Start Ollama service
-                subprocess.run(["ollama", "serve"],
-                             stdout=subprocess.DEVNULL,
-                             stderr=subprocess.DEVNULL,
-                             start_new_session=True)
+                subprocess.Popen(
+                    ["ollama", "serve"],
+                    stdout=subprocess.DEVNULL,
+                    stderr=subprocess.DEVNULL,
+                    start_new_session=True,
+                )
                 time.sleep(2)  # Give service time to start
                 return True
🧹 Nitpick comments (4)
setup.py (2)

22-32: Consider skipping Ollama setup in development mode.

Running the Ollama setup script on every develop install could be disruptive during iterative development. Consider checking for a skip environment variable or detecting virtual environments to avoid repeated setup attempts.

🔎 Proposed improvement
 class PostDevelopCommand(develop):
     """Post-development setup for Ollama."""
     def run(self):
         develop.run(self)
+        # Skip in development if already configured
+        if os.getenv("CORTEX_SKIP_OLLAMA_SETUP"):
+            print("ℹ️  Skipping Ollama setup (CORTEX_SKIP_OLLAMA_SETUP is set)")
+            return
         # Run Ollama setup after development install
         try:

51-51: Consider updating minimum dependency versions.

The fallback requirements specify anthropic>=0.18.0 and openai>=1.0.0, which are quite old (2+ years). While no active CVEs were found, using more recent stable versions (e.g., anthropic>=0.75.0, openai>=2.14.0) would improve compatibility and maintainability. However, this review uses the Chill profile, so treat this as optional unless long-term maintenance is a concern.

scripts/setup_ollama.py (1)

37-76: Consider adding security verification for installation script.

The function downloads and executes Ollama's installation script without verification. While this matches Ollama's official documentation, consider adding a note about security or optionally supporting SHA256 verification of the downloaded package for enhanced security posture.

cortex/providers/ollama_provider.py (1)

100-137: Security consideration: Installation script executed without verification.

The install_ollama() method downloads and directly executes Ollama's installation script without checksum verification. While this follows Ollama's official docs, consider documenting this security consideration in the method's docstring or adding optional SHA256 verification against published release checksums for enhanced security posture.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1bed948 and 4679715.

📒 Files selected for processing (10)
  • cortex/llm/interpreter.py
  • cortex/llm_router.py
  • cortex/providers/ollama_provider.py
  • examples/ollama_demo.py
  • scripts/setup_ollama.py
  • setup.py
  • tests/integration/test_end_to_end.py
  • tests/test_cli.py
  • tests/test_llm_router.py
  • tests/test_ollama_integration.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • cortex/llm/interpreter.py
  • tests/test_cli.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • cortex/providers/ollama_provider.py
  • scripts/setup_ollama.py
  • tests/test_llm_router.py
  • tests/test_ollama_integration.py
  • cortex/llm_router.py
  • examples/ollama_demo.py
  • tests/integration/test_end_to_end.py
  • setup.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Maintain >80% test coverage for pull requests

Files:

  • tests/test_llm_router.py
  • tests/test_ollama_integration.py
  • tests/integration/test_end_to_end.py
{setup.py,setup.cfg,pyproject.toml,**/__init__.py}

📄 CodeRabbit inference engine (AGENTS.md)

Use Python 3.10 or higher as the minimum supported version

Files:

  • setup.py
🧬 Code graph analysis (6)
cortex/providers/ollama_provider.py (2)
scripts/setup_ollama.py (1)
  • install_ollama (26-76)
cortex/llm_router.py (1)
  • complete (270-354)
scripts/setup_ollama.py (3)
cortex/providers/ollama_provider.py (1)
  • install_ollama (89-137)
cortex/logging_system.py (2)
  • info (198-200)
  • warning (202-204)
setup.py (2)
  • run (11-19)
  • run (24-32)
tests/test_llm_router.py (1)
cortex/llm_router.py (3)
  • route_task (208-268)
  • TaskType (35-45)
  • LLMProvider (48-53)
tests/test_ollama_integration.py (3)
cortex/llm_router.py (4)
  • LLMProvider (48-53)
  • TaskType (35-45)
  • route_task (208-268)
  • complete (270-354)
cortex/providers/ollama_provider.py (9)
  • OllamaProvider (31-440)
  • is_installed (79-86)
  • is_running (139-153)
  • get_available_models (194-217)
  • select_best_model (219-250)
  • pull_model (252-286)
  • ensure_model_available (288-312)
  • complete (314-373)
  • install_ollama (89-137)
scripts/setup_ollama.py (2)
  • setup_ollama (211-255)
  • install_ollama (26-76)
cortex/llm_router.py (1)
cortex/providers/ollama_provider.py (6)
  • OllamaProvider (31-440)
  • is_installed (79-86)
  • is_running (139-153)
  • start_service (155-192)
  • ensure_model_available (288-312)
  • complete (314-373)
examples/ollama_demo.py (2)
cortex/llm_router.py (3)
  • TaskType (35-45)
  • complete (270-354)
  • route_task (208-268)
cortex/providers/ollama_provider.py (6)
  • is_installed (79-86)
  • is_running (139-153)
  • start_service (155-192)
  • get_available_models (194-217)
  • ensure_model_available (288-312)
  • complete (314-373)
🪛 GitHub Actions: CI
cortex/providers/ollama_provider.py

[error] 1-1: Black formatting check failed. 9 files would be reformatted by Black; run 'black --write' to fix code style issues.

scripts/setup_ollama.py

[error] 1-1: Black formatting check failed. 9 files would be reformatted by Black; run 'black --write' to fix code style issues.

tests/test_llm_router.py

[error] 1-1: Black formatting check failed. 9 files would be reformatted by Black; run 'black --write' to fix code style issues.

tests/test_ollama_integration.py

[error] 1-1: Black formatting check failed. 9 files would be reformatted by Black; run 'black --write' to fix code style issues.

cortex/llm_router.py

[error] 1-1: Black formatting check failed. 9 files would be reformatted by Black; run 'black --write' to fix code style issues.

examples/ollama_demo.py

[error] 1-1: Black formatting check failed. 9 files would be reformatted by Black; run 'black --write' to fix code style issues.

tests/integration/test_end_to_end.py

[error] 1-1: Black formatting check failed. 9 files would be reformatted by Black; run 'black --write' to fix code style issues.

setup.py

[error] 1-1: Black formatting check failed. 9 files would be reformatted by Black; run 'black --write' to fix code style issues.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.10)

Comment on lines +31 to +68
def check_ollama_status():
    """Check Ollama installation and available models."""
    console.print("\n[bold cyan]🔍 Checking Ollama Status[/bold cyan]")

    provider = OllamaProvider()

    # Check installation
    if provider.is_installed():
        console.print("✅ Ollama installed", style="green")
    else:
        console.print("❌ Ollama not installed", style="red")
        console.print("\nInstall with: curl -fsSL https://ollama.com/install.sh | sh")
        return False

    # Check service
    if provider.is_running():
        console.print("✅ Ollama service running", style="green")
    else:
        console.print("⚠️ Ollama service not running", style="yellow")
        console.print("Starting service...")
        if provider.start_service():
            console.print("✅ Service started", style="green")
        else:
            console.print("❌ Failed to start service", style="red")
            return False

    # List models
    models = provider.get_available_models()
    if models:
        console.print("\n[bold]Available Models:[/bold]")
        for model in models:
            console.print(f" • {model}", style="cyan")
    else:
        console.print("\n⚠️ No models installed", style="yellow")
        console.print("Install default model: ollama pull phi3:mini")
        return False

    return True

🛠️ Refactor suggestion | 🟠 Major

Add type hints and enhance docstrings.

The function check_ollama_status() lacks type hints and has a minimal docstring. Per coding guidelines, type hints are required and docstrings should describe parameters and return values.

🔎 Proposed fix
-def check_ollama_status():
-    """Check Ollama installation and available models."""
+def check_ollama_status() -> bool:
+    """
+    Check Ollama installation and available models.
+    
+    Returns:
+        bool: True if Ollama is ready, False otherwise
+    """
🤖 Prompt for AI Agents
In examples/ollama_demo.py around lines 31 to 68, the check_ollama_status
function is missing type hints and has a minimal docstring; update the function
signature to include return type hint -> bool and add a local type for provider
(provider: OllamaProvider) if OllamaProvider is importable, and expand the
docstring to a short one-line summary plus a "Returns" section describing the
boolean meaning (True = Ollama installed, running, and has models; False = any
check failed). Keep the implementation unchanged except for annotations and the
enhanced docstring.

Comment on lines 71 to 98
def demo_local_completion():
    """Demonstrate local LLM completion."""
    console.print("\n[bold cyan]💬 Testing Local Completion[/bold cyan]")

    provider = OllamaProvider()

    # Ensure model available
    model = provider.ensure_model_available()
    if not model:
        console.print("❌ No model available", style="red")
        return

    console.print(f"Using model: [cyan]{model}[/cyan]")

    # Test completion
    messages = [
        {"role": "user", "content": "How do I install nginx on Ubuntu? Be brief."}
    ]

    console.print("\n[yellow]Generating response...[/yellow]")
    start_time = time.time()

    response = provider.complete(messages=messages, temperature=0.7, max_tokens=200)

    elapsed = time.time() - start_time

    console.print(f"\n[bold]Response ({elapsed:.2f}s):[/bold]")
    console.print(Panel(response.get("response", "No response"), style="green"))

🛠️ Refactor suggestion | 🟠 Major

Add type hints and enhance docstrings.

The function demo_local_completion() lacks type hints and a detailed docstring.

🔎 Proposed fix
-def demo_local_completion():
-    """Demonstrate local LLM completion."""
+def demo_local_completion() -> None:
+    """
+    Demonstrate local LLM completion using Ollama.
+    
+    Shows model selection and response generation with timing.
+    """
🤖 Prompt for AI Agents
In examples/ollama_demo.py around lines 71 to 98, add type hints and a richer
docstring to demo_local_completion(): annotate the function as returning None
(-> None), type the local variables/parameters used (e.g., messages:
list[dict[str, str]] or from typing import List, Dict), and type the provider
and response where helpful (e.g., response: dict[str, Any]). Replace the short
docstring with a multi-line docstring that explains the function’s purpose, side
effects (prints to console), inputs used (none), and return value (None). If you
introduce typing names (List, Dict, Any), add the necessary imports at the top
of the file.

Comment on lines 101 to 136
def demo_router_with_fallback():
    """Demonstrate LLM router with fallback."""
    console.print("\n[bold cyan]🧭 Testing LLM Router[/bold cyan]")

    router = LLMRouter(prefer_local=True)

    # Test routing decision
    routing = router.route_task(TaskType.SYSTEM_OPERATION)
    console.print(f"\nRouting decision: [cyan]{routing.provider.value}[/cyan]")
    console.print(f"Reasoning: {routing.reasoning}")

    # Test completion
    messages = [
        {"role": "user", "content": "List 3 lightweight text editors for Ubuntu"}
    ]

    console.print("\n[yellow]Generating response...[/yellow]")
    start_time = time.time()

    try:
        response = router.complete(
            messages=messages,
            task_type=TaskType.SYSTEM_OPERATION,
            temperature=0.7,
            max_tokens=200
        )

        elapsed = time.time() - start_time

        console.print(f"\n[bold]Response from {response.provider.value} ({elapsed:.2f}s):[/bold]")
        console.print(Panel(response.content, style="green"))
        console.print(f"\nCost: ${response.cost_usd:.4f} | Tokens: {response.tokens_used}")

    except Exception as e:
        console.print(f"❌ Error: {e}", style="red")


🛠️ Refactor suggestion | 🟠 Major

Add type hints and enhance docstrings.

The function demo_router_with_fallback() lacks type hints and a detailed docstring.

🔎 Proposed fix
-def demo_router_with_fallback():
-    """Demonstrate LLM router with fallback."""
+def demo_router_with_fallback() -> None:
+    """
+    Demonstrate LLM router with fallback behavior.
+    
+    Shows routing decisions and completion with cost/token tracking.
+    """
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def demo_router_with_fallback():
    """Demonstrate LLM router with fallback."""
    console.print("\n[bold cyan]🧭 Testing LLM Router[/bold cyan]")
    router = LLMRouter(prefer_local=True)
    # Test routing decision
    routing = router.route_task(TaskType.SYSTEM_OPERATION)
    console.print(f"\nRouting decision: [cyan]{routing.provider.value}[/cyan]")
    console.print(f"Reasoning: {routing.reasoning}")
    # Test completion
    messages = [
        {"role": "user", "content": "List 3 lightweight text editors for Ubuntu"}
    ]
    console.print("\n[yellow]Generating response...[/yellow]")
    start_time = time.time()
    try:
        response = router.complete(
            messages=messages,
            task_type=TaskType.SYSTEM_OPERATION,
            temperature=0.7,
            max_tokens=200
        )
        elapsed = time.time() - start_time
        console.print(f"\n[bold]Response from {response.provider.value} ({elapsed:.2f}s):[/bold]")
        console.print(Panel(response.content, style="green"))
        console.print(f"\nCost: ${response.cost_usd:.4f} | Tokens: {response.tokens_used}")
    except Exception as e:
        console.print(f"❌ Error: {e}", style="red")
def demo_router_with_fallback() -> None:
    """
    Demonstrate LLM router with fallback behavior.
    Shows routing decisions and completion with cost/token tracking.
    """
    console.print("\n[bold cyan]🧭 Testing LLM Router[/bold cyan]")
    router = LLMRouter(prefer_local=True)
    # Test routing decision
    routing = router.route_task(TaskType.SYSTEM_OPERATION)
    console.print(f"\nRouting decision: [cyan]{routing.provider.value}[/cyan]")
    console.print(f"Reasoning: {routing.reasoning}")
    # Test completion
    messages = [
        {"role": "user", "content": "List 3 lightweight text editors for Ubuntu"}
    ]
    console.print("\n[yellow]Generating response...[/yellow]")
    start_time = time.time()
    try:
        response = router.complete(
            messages=messages,
            task_type=TaskType.SYSTEM_OPERATION,
            temperature=0.7,
            max_tokens=200
        )
        elapsed = time.time() - start_time
        console.print(f"\n[bold]Response from {response.provider.value} ({elapsed:.2f}s):[/bold]")
        console.print(Panel(response.content, style="green"))
        console.print(f"\nCost: ${response.cost_usd:.4f} | Tokens: {response.tokens_used}")
    except Exception as e:
        console.print(f"❌ Error: {e}", style="red")
🤖 Prompt for AI Agents
In examples/ollama_demo.py around lines 101 to 136, the function
demo_router_with_fallback() lacks type annotations and a descriptive docstring;
update the function to include a clear docstring describing purpose, behavior,
return value, and exceptions, and add type hints to the signature (e.g., ->
None) and to key local variables (router: LLMRouter, messages: list[dict[str,
str]], response: <appropriate response type or Any>) as well as the raised
exception handling, ensuring required types are imported or referenced via
typing.Any if concrete types are unavailable.

Comment on lines +138 to +157
def show_provider_comparison():
    """Show comparison between providers."""
    console.print("\n[bold cyan]📊 Provider Comparison[/bold cyan]\n")

    table = Table(title="LLM Provider Comparison")

    table.add_column("Feature", style="cyan")
    table.add_column("Ollama (Local)", style="green")
    table.add_column("Claude", style="yellow")
    table.add_column("OpenAI", style="blue")

    table.add_row("Privacy", "100% Local ✅", "Cloud", "Cloud")
    table.add_row("Cost", "$0", "$3-15/1M tokens", "$2-30/1M tokens")
    table.add_row("Offline", "Yes ✅", "No", "No")
    table.add_row("API Key", "Not needed ✅", "Required", "Required")
    table.add_row("Speed", "Varies by HW", "Fast", "Fast")
    table.add_row("Quality", "Good-Excellent", "Excellent", "Excellent")
    table.add_row("Setup", "Auto ✅", "Manual", "Manual")

    console.print(table)

🛠️ Refactor suggestion | 🟠 Major

Add type hints and enhance docstrings.

The function show_provider_comparison() lacks type hints and a detailed docstring.

🔎 Proposed fix
-def show_provider_comparison():
-    """Show comparison between providers."""
+def show_provider_comparison() -> None:
+    """
+    Display a comparison table of LLM providers.
+    
+    Compares Ollama, Claude, and OpenAI across multiple dimensions.
+    """
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def show_provider_comparison():
    """Show comparison between providers."""
    console.print("\n[bold cyan]📊 Provider Comparison[/bold cyan]\n")
    table = Table(title="LLM Provider Comparison")
    table.add_column("Feature", style="cyan")
    table.add_column("Ollama (Local)", style="green")
    table.add_column("Claude", style="yellow")
    table.add_column("OpenAI", style="blue")
    table.add_row("Privacy", "100% Local ✅", "Cloud", "Cloud")
    table.add_row("Cost", "$0", "$3-15/1M tokens", "$2-30/1M tokens")
    table.add_row("Offline", "Yes ✅", "No", "No")
    table.add_row("API Key", "Not needed ✅", "Required", "Required")
    table.add_row("Speed", "Varies by HW", "Fast", "Fast")
    table.add_row("Quality", "Good-Excellent", "Excellent", "Excellent")
    table.add_row("Setup", "Auto ✅", "Manual", "Manual")
    console.print(table)
def show_provider_comparison() -> None:
    """
    Display a comparison table of LLM providers.
    Compares Ollama, Claude, and OpenAI across multiple dimensions.
    """
    console.print("\n[bold cyan]📊 Provider Comparison[/bold cyan]\n")
    table = Table(title="LLM Provider Comparison")
    table.add_column("Feature", style="cyan")
    table.add_column("Ollama (Local)", style="green")
    table.add_column("Claude", style="yellow")
    table.add_column("OpenAI", style="blue")
    table.add_row("Privacy", "100% Local ✅", "Cloud", "Cloud")
    table.add_row("Cost", "$0", "$3-15/1M tokens", "$2-30/1M tokens")
    table.add_row("Offline", "Yes ✅", "No", "No")
    table.add_row("API Key", "Not needed ✅", "Required", "Required")
    table.add_row("Speed", "Varies by HW", "Fast", "Fast")
    table.add_row("Quality", "Good-Excellent", "Excellent", "Excellent")
    table.add_row("Setup", "Auto ✅", "Manual", "Manual")
    console.print(table)
🤖 Prompt for AI Agents
In examples/ollama_demo.py around lines 138 to 157, the function
show_provider_comparison() lacks type hints and has a minimal docstring; update
it to include a clear, descriptive docstring (what the function does, any side
effects like printing to console) and add Python type hints (e.g., def
show_provider_comparison() -> None), and if the function uses or relies on
module-level objects like console or Table, mention them in the docstring; keep
behavior unchanged and ensure the docstring follows the project's style
(one-line summary + optional longer description).

… and enhance test assertions across multiple files

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
cortex/llm_router.py (3)

591-596: Async path does not support Ollama routing.

The acomplete() method only handles CLAUDE and KIMI_K2 providers in the try block (lines 592-596). When routing.provider == LLMProvider.OLLAMA, it falls through to the else branch and calls _acomplete_kimi, which is incorrect behavior.

🔎 Proposed fix
         try:
-            if routing.provider == LLMProvider.CLAUDE:
+            if routing.provider == LLMProvider.OLLAMA:
+                # For now, run sync Ollama in thread pool
+                # TODO: Implement native async Ollama support
+                import asyncio
+                loop = asyncio.get_event_loop()
+                response = await loop.run_in_executor(
+                    None,
+                    lambda: self._complete_ollama(messages, temperature, max_tokens)
+                )
+            elif routing.provider == LLMProvider.CLAUDE:
                 response = await self._acomplete_claude(messages, temperature, max_tokens, tools)
             else:  # KIMI_K2
                 response = await self._acomplete_kimi(messages, temperature, max_tokens, tools)

608-614: Async fallback logic doesn't include Ollama.

The async fallback only toggles between Claude and Kimi K2, missing Ollama from the fallback chain. This is inconsistent with the sync complete() fallback behavior.

🔎 Proposed fix to align with sync fallback logic
             if self.enable_fallback:
-                fallback_provider = (
-                    LLMProvider.KIMI_K2
-                    if routing.provider == LLMProvider.CLAUDE
-                    else LLMProvider.CLAUDE
-                )
+                # Smart fallback priority matching sync path
+                if routing.provider == LLMProvider.OLLAMA:
+                    fallback_provider = (
+                        LLMProvider.CLAUDE
+                        if self.claude_client_async
+                        else LLMProvider.KIMI_K2 if self.kimi_client_async else None
+                    )
+                elif routing.provider == LLMProvider.CLAUDE:
+                    fallback_provider = (
+                        LLMProvider.KIMI_K2 if self.kimi_client_async else None
+                    )
+                else:  # KIMI_K2
+                    fallback_provider = (
+                        LLMProvider.CLAUDE if self.claude_client_async else None
+                    )
+                
+                if fallback_provider is None:
+                    raise RuntimeError("No fallback provider available")
+                    
                 logger.info(f"🔄 Attempting fallback to {fallback_provider.value}")

521-544: get_stats() doesn't include Ollama statistics.

The get_stats() method returns statistics for Claude and Kimi K2 but omits Ollama, even though provider_stats tracks Ollama (line 203). This inconsistency means Ollama usage won't be visible in stats.

🔎 Proposed fix
         with self._stats_lock:
             return {
                 "total_requests": self.request_count,
                 "total_cost_usd": round(self.total_cost_usd, 4),
                 "providers": {
+                    "ollama": {
+                        "requests": self.provider_stats[LLMProvider.OLLAMA]["requests"],
+                        "tokens": self.provider_stats[LLMProvider.OLLAMA]["tokens"],
+                        "cost_usd": round(self.provider_stats[LLMProvider.OLLAMA]["cost"], 4),
+                    },
                     "claude": {
                         "requests": self.provider_stats[LLMProvider.CLAUDE]["requests"],
                         "tokens": self.provider_stats[LLMProvider.CLAUDE]["tokens"],
                         "cost_usd": round(self.provider_stats[LLMProvider.CLAUDE]["cost"], 4),
                     },
                     "kimi_k2": {
                         "requests": self.provider_stats[LLMProvider.KIMI_K2]["requests"],
                         "tokens": self.provider_stats[LLMProvider.KIMI_K2]["tokens"],
                         "cost_usd": round(self.provider_stats[LLMProvider.KIMI_K2]["cost"], 4),
                     },
                 },
             }
♻️ Duplicate comments (2)
scripts/setup_ollama.py (1)

203-205: Missing f-string prefix in timeout error message.

The error message on line 204 uses {model_name} but lacks the f prefix, so the variable won't be interpolated.

🔎 Proposed fix
     except subprocess.TimeoutExpired:
-        logger.warning("⚠️  Model download timed out - try again with: ollama pull {model_name}")
+        logger.warning(f"⚠️  Model download timed out - try again with: ollama pull {model_name}")
         return False
cortex/providers/ollama_provider.py (1)

125-132: subprocess.run blocks indefinitely for long-running service.

The call to subprocess.run(["ollama", "serve"], ...) will block because ollama serve is a long-running daemon. Even with start_new_session=True, run() waits for completion. This should use subprocess.Popen like start_service() does at line 172.

🔎 Proposed fix
             if install_result.returncode == 0:
                 logger.info("✅ Ollama installed successfully")
                 # Start Ollama service
-                subprocess.run(
+                subprocess.Popen(
                     ["ollama", "serve"],
                     stdout=subprocess.DEVNULL,
                     stderr=subprocess.DEVNULL,
                     start_new_session=True,
                 )
                 time.sleep(2)  # Give service time to start
                 return True
🧹 Nitpick comments (5)
scripts/setup_ollama.py (3)

40-61: Security: Remote script execution without verification.

The pattern of downloading and executing a remote script (curl | sh) is inherently risky. While this is Ollama's official installation method, consider adding a note in the docstring about this security consideration. The script could be tampered with if the network/TLS is compromised.

🔎 Suggested docstring update
 def install_ollama() -> bool:
     """
     Install Ollama using the official installation script.
 
+    Note: This downloads and executes a remote script. For enhanced security,
+    consider manually downloading and inspecting the script, or verifying the
+    package SHA256 against Ollama's GitHub releases.
+
     Returns:
         True if installation succeeded, False otherwise
     """

93-107: Service startup uses fixed sleep without health check.

start_ollama_service() uses a fixed 2-second sleep after starting the service. On slower systems, this may not be enough time; on faster systems, it wastes time. Consider implementing a retry loop with health checks similar to the pattern used in OllamaProvider.start_service().

🔎 Proposed improvement
     try:
         # Start Ollama in background
         subprocess.Popen(
             ["ollama", "serve"],
             stdout=subprocess.DEVNULL,
             stderr=subprocess.DEVNULL,
             start_new_session=True,
         )
 
-        # Give it a moment to start
-        time.sleep(2)
-        logger.info("✅ Ollama service started")
-        return True
+        # Wait for service to become ready
+        import requests
+        for _ in range(10):
+            time.sleep(1)
+            try:
+                resp = requests.get("http://localhost:11434/api/tags", timeout=2)
+                if resp.status_code == 200:
+                    logger.info("✅ Ollama service started")
+                    return True
+            except Exception:
+                pass
+        logger.warning("⚠️  Ollama service may not be fully ready")
+        return True
 
     except Exception as e:
         logger.warning(f"⚠️  Failed to start Ollama service: {e}")
         return False

130-132: Tuple unpacking issue with is_default variable.

The tuple unpacking (name, size, desc, *is_default) captures is_default as a list (empty or containing True). The check if is_default is truthy only when a fourth element is present ([True] is truthy) and falsy when it is absent (empty list), so the default marker behaves as intended, but the starred unpacking makes that easy to misread.

🔎 Clearer alternative
-    for i, (name, size, desc, *is_default) in enumerate(models, 1):
-        default_marker = " ⭐" if is_default else ""
+    for i, model in enumerate(models, 1):
+        name, size, desc = model[:3]
+        is_default = len(model) > 3 and model[3]
+        default_marker = " ⭐" if is_default else ""
cortex/llm_router.py (1)

388-390: Token estimation is a rough heuristic.

The token count estimation len(content.split()) * 1.3 is a very rough approximation. Consider documenting this limitation or using Ollama's actual token count if available in the response.
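A minimal sketch of the second option, assuming the Ollama response dict carries the prompt_eval_count / eval_count fields that /api/generate reports, with the existing heuristic kept as a fallback:

def estimate_tokens(ollama_response: dict, content: str) -> int:
    """Prefer Ollama's reported token counts; fall back to the rough word-based heuristic."""
    prompt_tokens = ollama_response.get("prompt_eval_count")
    output_tokens = ollama_response.get("eval_count")
    if prompt_tokens is not None and output_tokens is not None:
        return prompt_tokens + output_tokens
    return int(len(content.split()) * 1.3)  # ~1.3 tokens per whitespace-separated word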

cortex/providers/ollama_provider.py (1)

371-395: Message conversion could be improved for multi-turn conversations.

The _messages_to_prompt method produces a simple concatenated format. For models that support native chat, consider using Ollama's /api/chat endpoint instead of /api/generate for better multi-turn handling.
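A minimal sketch of that alternative, assuming Ollama's /api/chat endpoint, which accepts the role/content message list directly; base_url and the timeout are placeholders:

import requests

def chat(base_url: str, model: str, messages: list[dict[str, str]]) -> str:
    """Send the conversation to Ollama's chat endpoint instead of flattening it into one prompt."""
    resp = requests.post(
        f"{base_url}/api/chat",
        json={"model": model, "messages": messages, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["message"]["content"]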

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4679715 and b35b693.

📒 Files selected for processing (9)
  • cortex/llm/interpreter.py
  • cortex/llm_router.py
  • cortex/providers/ollama_provider.py
  • examples/ollama_demo.py
  • scripts/setup_ollama.py
  • setup.py
  • tests/integration/test_end_to_end.py
  • tests/test_llm_router.py
  • tests/test_ollama_integration.py
🚧 Files skipped from review as they are similar to previous changes (5)
  • tests/test_ollama_integration.py
  • cortex/llm/interpreter.py
  • setup.py
  • tests/integration/test_end_to_end.py
  • examples/ollama_demo.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • scripts/setup_ollama.py
  • tests/test_llm_router.py
  • cortex/llm_router.py
  • cortex/providers/ollama_provider.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Maintain >80% test coverage for pull requests

Files:

  • tests/test_llm_router.py
🔇 Additional comments (18)
scripts/setup_ollama.py (3)

1-18: Module structure and imports look good.

The imports are well-organized, and the logging setup at module level is appropriate for a setup script.


21-23: LGTM!

Clean implementation using shutil.which for checking binary availability.


211-261: Main orchestration function looks well-structured.

The setup_ollama() function properly handles CI environments, skip flags, non-interactive mode, and provides clear user guidance. Good defensive programming.

tests/test_llm_router.py (5)

38-44: Test properly updated for Ollama-first routing.

The assertion assertIn(decision.provider, [LLMProvider.OLLAMA, LLMProvider.CLAUDE]) correctly reflects the new routing behavior where Ollama is preferred but Claude is an acceptable fallback.


112-122: Fallback test updated appropriately.

The test now correctly expects any of the three providers when Ollama is unavailable, reflecting the intelligent fallback chain.


419-462: End-to-end test properly mocks Ollama provider.

The mock setup for OllamaProvider is well-structured, correctly simulating both the is_running() and complete() methods to test the Ollama-first routing path.


464-510: Fallback test correctly verifies Claude fallback after Ollama failure.

The test properly sets up Ollama to fail, then Kimi to fail, expecting Claude to succeed as the final fallback. This validates the complete fallback chain.


676-682: Batch test assertions correctly broadened.

The assertions now accept multiple providers for batch processing results, which aligns with the dynamic routing behavior based on provider availability.

cortex/llm_router.py (4)

51-53: Ollama enum member added correctly.

The new OLLAMA provider is properly added to the enum with a descriptive comment.


94-106: Cost structure properly includes Ollama with zero cost.

Good design decision to include Ollama in the cost tracking structure with zero costs since it runs locally.
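
A simplified sketch of what such a cost table can look like; the per-million-token figures are placeholders for illustration, not the router's real rates:

# USD per 1M tokens (input, output); local inference is free.
COSTS = {
    "ollama":  {"input": 0.0, "output": 0.0},
    "claude":  {"input": 3.0, "output": 15.0},   # placeholder pricing
    "kimi_k2": {"input": 0.6, "output": 2.5},    # placeholder pricing
}

def estimate_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    rate = COSTS[provider]
    return (input_tokens * rate["input"] + output_tokens * rate["output"]) / 1_000_000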


149-167: Robust Ollama initialization with graceful degradation.

The initialization properly handles installation checks, service startup, and model availability, with appropriate fallback to cloud providers when Ollama is unavailable.


356-400: Ollama completion method implemented correctly.

The _complete_ollama method properly handles the Ollama API response format and provides a consistent LLMResponse structure.
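
As a rough sketch of handling that response format: the field names follow Ollama's /api/generate JSON, while the result dataclass is a stand-in, not the router's real LLMResponse.

import requests
from dataclasses import dataclass

@dataclass
class CompletionResult:  # stand-in for the router's response type
    content: str
    model: str
    tokens_used: int

def generate(prompt: str, model: str = "codellama:7b",
             base_url: str = "http://localhost:11434") -> CompletionResult:
    resp = requests.post(
        f"{base_url}/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    data = resp.json()
    # Ollama reports real token counts, so no word-count estimate is needed.
    tokens = data.get("prompt_eval_count", 0) + data.get("eval_count", 0)
    return CompletionResult(content=data.get("response", ""),
                            model=data.get("model", model),
                            tokens_used=tokens)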

cortex/providers/ollama_provider.py (6)

1-28: Well-structured module with clear documentation.

The module docstring clearly outlines the features, and imports are properly organized. Good use of type hints from collections.abc.


39-54: Model preference lists are well-designed.

The PREFERRED_MODELS list prioritizes code-focused models, with sensible FALLBACK_MODELS as backup. This aligns with the stated goal of code/system task optimization.


148-152: Good exception handling for service check.

Using requests.RequestException as the catch-all for network errors is the correct approach, addressing the earlier concern about overly broad exception handling.
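
A compact sketch of that health check, assuming Ollama's default port:

import requests

def is_running(base_url: str = "http://localhost:11434") -> bool:
    """Return True if the Ollama HTTP API answers on its default port."""
    try:
        return requests.get(base_url, timeout=2).status_code == 200
    except requests.RequestException:
        # Covers connection errors, timeouts, DNS failures, etc.
        return False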


170-187: Service startup with proper retry loop.

The start_service() method correctly uses Popen for the background process and implements a retry loop to check for service readiness. This is the correct pattern.
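
The pattern being described, in sketch form (reusing the is_running() check above; retry counts and sleeps are arbitrary, not the provider's actual values):

import subprocess
import time

def start_service(retries: int = 10, delay: float = 1.0) -> bool:
    """Start `ollama serve` in the background and wait until the API responds."""
    if is_running():
        return True
    # Detach stdout/stderr so the background daemon doesn't block the caller.
    subprocess.Popen(["ollama", "serve"],
                     stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    for _ in range(retries):
        time.sleep(delay)
        if is_running():
            return True
    return False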


260-282: Model pull implements streaming progress correctly.

The pull_model method properly handles the streaming response to show progress updates. Good user experience for long-running downloads.
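
A hedged sketch of that streaming pattern against Ollama's /api/pull endpoint; the progress formatting is simplified compared to the real implementation:

import json
import requests

def pull_model(name: str, base_url: str = "http://localhost:11434") -> None:
    """Download a model, printing status lines as Ollama streams them."""
    with requests.post(f"{base_url}/api/pull",
                       json={"name": name}, stream=True, timeout=None) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            if not line:
                continue
            update = json.loads(line)
            status = update.get("status", "")
            done, total = update.get("completed"), update.get("total")
            if done and total:
                print(f"{status}: {done / total:.0%}")
            else:
                print(status)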


310-369: Complete method is well-implemented.

The complete() method properly ensures service availability, handles model selection, and supports both streaming and non-streaming modes. Error handling is appropriate.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/integration/test_end_to_end.py (1)

124-135: Consider retaining exit code validation as a safety net.

The string-based assertions are reasonable for handling pip warnings, but dropping the result.succeeded() check means process-level failures (e.g., Docker issues, pip failures before tests run) won't be caught explicitly. The "passed" check should catch most issues, but adding exit code validation provides defense in depth.

Also, the comment on line 124 mentions "ignoring pip warnings" but the logic actually combines streams to find "passed" rather than filtering warnings.

🔎 Suggested improvement
         # Check that tests passed, ignoring pip warnings in stderr
         combined_output = f"{result.stdout}\n{result.stderr}"
+        # Ensure the process itself didn't crash (exit code check)
+        self.assertTrue(
+            result.succeeded() or "passed" in combined_output.lower(),
+            msg=f"Process failed with no test output.\nStdout: {result.stdout}\nStderr: {result.stderr}",
+        )
         self.assertIn(
             "passed",
             combined_output.lower(),
             msg=f"Tests did not pass.\nStdout: {result.stdout}\nStderr: {result.stderr}",
         )

Alternatively, if the current behavior is intentional, update the comment to reflect what the code actually does:

-        # Check that tests passed, ignoring pip warnings in stderr
+        # Check combined stdout/stderr for pytest results (allows warnings in stderr)
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b35b693 and 561c356.

📒 Files selected for processing (1)
  • tests/integration/test_end_to_end.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • tests/integration/test_end_to_end.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Maintain >80% test coverage for pull requests

Files:

  • tests/integration/test_end_to_end.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Build Package
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.10)
  • GitHub Check: test (3.12)
🔇 Additional comments (1)
tests/integration/test_end_to_end.py (1)

118-118: Reasonable exclusion for Docker environment.

Ignoring test_ollama_integration.py is appropriate since the Ollama service won't be available in the Docker test container. This aligns with the PR's commit message about ignoring this test file in CI/Docker.

The Ollama integration tests work fine in regular CI but fail when run
inside the nested Docker container test environment. This is because:

1. The container environment has limited mocking capabilities
2. The fake API keys in CI (test-key-for-ci) can cause issues with real client initialization
3. These tests are still run in the main CI pipeline, just not in this nested container test

This only affects test_project_tests_run_inside_container, not the main test suite.
Use regex to specifically look for pytest's 'X failed' pattern instead of
just checking if 'failed' appears anywhere in output. This prevents false
positives from warnings that contain the word 'failed' but aren't about
test failures.
Replace unbounded quantifiers (\d+ and \s+) with:
- Bounded digit match: \d{1,5} (max 99999 test failures)
- Single space instead of \s+ (pytest always uses single space)

This eliminates any backtracking concerns while maintaining functionality.
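
A sketch of the described check; names and structure are illustrative, not the test file's actual code:

import re

def assert_pytest_passed(stdout: str, stderr: str) -> None:
    """Fail if pytest's summary reports any '<N> failed' entries."""
    combined = f"{stdout}\n{stderr}".lower()
    # Bounded quantifier avoids pathological backtracking on huge outputs.
    failed = re.search(r"(\d{1,5}) failed", combined)
    assert "passed" in combined, "no passing tests found in pytest output"
    assert failed is None, f"pytest reported failures: {failed.group(0)}"
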
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/integration/test_end_to_end.py (1)

133-133: Consider moving import re to the top of the file.

Per PEP 8, imports should be at the top of the file. While localizing the import works, placing it with the other imports at the top maintains consistency and improves readability.

🔎 Proposed fix

Add at the top of the file (around line 7):

import re

Then remove the local import at line 133.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 561c356 and 498bb94.

📒 Files selected for processing (1)
  • tests/integration/test_end_to_end.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • tests/integration/test_end_to_end.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Maintain >80% test coverage for pull requests

Files:

  • tests/integration/test_end_to_end.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Build Package
  • GitHub Check: test (3.10)
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.12)
🔇 Additional comments (3)
tests/integration/test_end_to_end.py (3)

20-21: LGTM!

The --root-user-action=ignore and --no-cache-dir flags are appropriate for ephemeral Docker container environments—they suppress unnecessary warnings and avoid wasting space on pip cache.


118-118: LGTM!

Excluding tests/test_ollama_integration.py is appropriate since nested Docker containers lack the Ollama service, and running those tests would produce false failures.


135-138: LGTM!

The bounded regex pattern (\d{1,5}) is a solid mitigation against ReDoS. The two-step verification—checking for "passed" and asserting zero failures—provides robust test outcome validation.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
tests/test_cli.py (1)

61-70: Verify Ollama-specific integration details in the test.

The test mocks the entire CommandInterpreter class, which bypasses verification that Ollama is actually being used as the provider. Consider adding assertions to verify:

  1. The interpreter was initialized with the correct provider ("ollama")
  2. The API key fallback to "ollama-local" occurs when no cloud API keys are set
  3. The _get_provider() method returns "ollama" when CORTEX_PROVIDER is set
🔎 Proposed enhanced test
 @patch.dict(os.environ, {"CORTEX_PROVIDER": "ollama"}, clear=True)
 @patch("cortex.cli.CommandInterpreter")
 def test_install_no_api_key(self, mock_interpreter_class):
     # With Ollama integration, no API key is needed - should succeed
     mock_interpreter = Mock()
     mock_interpreter.parse.return_value = ["apt update", "apt install docker"]
     mock_interpreter_class.return_value = mock_interpreter

     result = self.cli.install("docker", dry_run=True)
     self.assertEqual(result, 0)
+    
+    # Verify Ollama provider is used
+    mock_interpreter_class.assert_called_once()
+    call_args = mock_interpreter_class.call_args
+    self.assertEqual(call_args[0][1], "ollama")  # provider argument
+    self.assertEqual(call_args[0][0], "ollama-local")  # api_key argument
tests/test_llm_router.py (1)

678-684: Mock provider availability for deterministic batch testing.

The batch test should mock Ollama availability to ensure deterministic behavior. While the comment acknowledges that "providers may be different based on availability," this creates non-determinism in the test suite.

🔎 Recommended approach
+@patch("cortex.llm_router.OllamaProvider")
 @patch("cortex.llm_router.AsyncAnthropic")
 @patch("cortex.llm_router.AsyncOpenAI")
-def test_complete_batch(self, mock_async_openai, mock_async_anthropic):
+def test_complete_batch(self, mock_async_openai, mock_async_anthropic, mock_ollama_class):
     """Test batch completion with multiple requests."""
+    # Mock Ollama availability
+    mock_ollama = Mock()
+    mock_ollama.is_running.return_value = True
+    mock_ollama.complete.return_value = {
+        "response": "Ollama batch response",
+        "model": "codellama:latest"
+    }
+    mock_ollama_class.return_value = mock_ollama
+    
     # Mock responses
     mock_claude_response = Mock()
     # ... rest of mocking ...
     
     router = LLMRouter(claude_api_key="test-claude", kimi_api_key="test-kimi")
+    router.ollama_client = mock_ollama
     router.claude_client_async = mock_claude_client
     router.kimi_client_async = mock_kimi_client

     async def run_test():
         requests = [
             # ... requests ...
         ]

         responses = await router.complete_batch(requests, max_concurrent=2)
         self.assertEqual(len(responses), 2)
-        # With Ollama integration, providers may be different based on availability
-        self.assertIn(
-            responses[0].provider, [LLMProvider.OLLAMA, LLMProvider.CLAUDE, LLMProvider.KIMI_K2]
-        )
-        self.assertIn(
-            responses[1].provider, [LLMProvider.OLLAMA, LLMProvider.CLAUDE, LLMProvider.KIMI_K2]
-        )
+        # With Ollama mocked as available, both should use OLLAMA
+        self.assertEqual(responses[0].provider, LLMProvider.OLLAMA)
+        self.assertEqual(responses[1].provider, LLMProvider.OLLAMA)
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 498bb94 and 362b43d.

📒 Files selected for processing (2)
  • tests/test_cli.py
  • tests/test_llm_router.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • tests/test_llm_router.py
  • tests/test_cli.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Maintain >80% test coverage for pull requests

Files:

  • tests/test_llm_router.py
  • tests/test_cli.py
🧬 Code graph analysis (2)
tests/test_llm_router.py (2)
cortex/llm_router.py (3)
  • route_task (208-268)
  • TaskType (35-45)
  • LLMProvider (48-53)
cortex/providers/ollama_provider.py (1)
  • is_running (141-152)
tests/test_cli.py (3)
cortex/llm/interpreter.py (1)
  • parse (275-336)
cortex/packages.py (1)
  • parse (383-427)
cortex/cli.py (1)
  • install (332-580)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test (Python 3.12)
  • GitHub Check: Test (Python 3.10)
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.10)
  • GitHub Check: test (3.11)
🔇 Additional comments (2)
tests/test_llm_router.py (2)

421-465: Well-structured Ollama integration test.

This test demonstrates good testing practices:

  • Explicitly mocks OllamaProvider.is_running() and complete() methods
  • Verifies the expected provider (OLLAMA) is used
  • Validates response content contains expected information
  • Properly isolates the Ollama integration from external dependencies

This approach should be applied to the routing tests (lines 38-100) for consistency and determinism.


469-513: Excellent fallback cascade test.

This test properly validates the fallback chain when providers fail:

  1. Ollama fails with exception
  2. Kimi K2 fails with exception
  3. Claude succeeds and returns the fallback response

The test correctly verifies both the final provider used and the response content. This demonstrates proper error handling and fallback behavior.

Comment on lines +38 to 100
     def test_user_chat_routes_to_ollama(self):
         """User chat tasks should route to Ollama by default."""
         decision = self.router.route_task(TaskType.USER_CHAT)
-        self.assertEqual(decision.provider, LLMProvider.CLAUDE)
+        # With Ollama integration, defaults to Ollama, but falls back to Claude if unavailable
+        self.assertIn(decision.provider, [LLMProvider.OLLAMA, LLMProvider.CLAUDE])
         self.assertEqual(decision.task_type, TaskType.USER_CHAT)
         self.assertGreater(decision.confidence, 0.9)

-    def test_system_operation_routes_to_kimi(self):
-        """System operations should route to Kimi K2."""
+    def test_system_operation_routes_to_ollama(self):
+        """System operations should route to Ollama by default."""
         decision = self.router.route_task(TaskType.SYSTEM_OPERATION)
-        self.assertEqual(decision.provider, LLMProvider.KIMI_K2)
+        # With Ollama integration, defaults to Ollama, but falls back if unavailable
+        self.assertIn(
+            decision.provider, [LLMProvider.OLLAMA, LLMProvider.KIMI_K2, LLMProvider.CLAUDE]
+        )
         self.assertEqual(decision.task_type, TaskType.SYSTEM_OPERATION)

-    def test_error_debugging_routes_to_kimi(self):
-        """Error debugging should route to Kimi K2."""
+    def test_error_debugging_routes_to_ollama(self):
+        """Error debugging should route to Ollama by default."""
         decision = self.router.route_task(TaskType.ERROR_DEBUGGING)
-        self.assertEqual(decision.provider, LLMProvider.KIMI_K2)
+        # With Ollama integration, defaults to Ollama, but falls back if unavailable
+        self.assertIn(
+            decision.provider, [LLMProvider.OLLAMA, LLMProvider.KIMI_K2, LLMProvider.CLAUDE]
+        )

-    def test_requirement_parsing_routes_to_claude(self):
-        """Requirement parsing should route to Claude."""
+    def test_requirement_parsing_routes_to_ollama(self):
+        """Requirement parsing should route to Ollama by default."""
         decision = self.router.route_task(TaskType.REQUIREMENT_PARSING)
-        self.assertEqual(decision.provider, LLMProvider.CLAUDE)
+        # With Ollama integration, defaults to Ollama, but falls back if unavailable
+        self.assertIn(decision.provider, [LLMProvider.OLLAMA, LLMProvider.CLAUDE])

-    def test_code_generation_routes_to_kimi(self):
-        """Code generation should route to Kimi K2."""
+    def test_code_generation_routes_to_ollama(self):
+        """Code generation should route to Ollama by default."""
         decision = self.router.route_task(TaskType.CODE_GENERATION)
-        self.assertEqual(decision.provider, LLMProvider.KIMI_K2)
+        # With Ollama integration, defaults to Ollama, but falls back if unavailable
+        self.assertIn(
+            decision.provider, [LLMProvider.OLLAMA, LLMProvider.KIMI_K2, LLMProvider.CLAUDE]
+        )

-    def test_dependency_resolution_routes_to_kimi(self):
-        """Dependency resolution should route to Kimi K2."""
+    def test_dependency_resolution_routes_to_ollama(self):
+        """Dependency resolution should route to Ollama by default."""
         decision = self.router.route_task(TaskType.DEPENDENCY_RESOLUTION)
-        self.assertEqual(decision.provider, LLMProvider.KIMI_K2)
+        # With Ollama integration, defaults to Ollama, but falls back if unavailable
+        self.assertIn(
+            decision.provider, [LLMProvider.OLLAMA, LLMProvider.KIMI_K2, LLMProvider.CLAUDE]
+        )

-    def test_configuration_routes_to_kimi(self):
-        """Configuration tasks should route to Kimi K2."""
+    def test_configuration_routes_to_ollama(self):
+        """Configuration tasks should route to Ollama by default."""
         decision = self.router.route_task(TaskType.CONFIGURATION)
-        self.assertEqual(decision.provider, LLMProvider.KIMI_K2)
+        # With Ollama integration, defaults to Ollama, but falls back if unavailable
+        self.assertIn(
+            decision.provider, [LLMProvider.OLLAMA, LLMProvider.KIMI_K2, LLMProvider.CLAUDE]
+        )

-    def test_tool_execution_routes_to_kimi(self):
-        """Tool execution should route to Kimi K2."""
+    def test_tool_execution_routes_to_ollama(self):
+        """Tool execution should route to Ollama by default."""
         decision = self.router.route_task(TaskType.TOOL_EXECUTION)
-        self.assertEqual(decision.provider, LLMProvider.KIMI_K2)
+        # With Ollama integration, defaults to Ollama, but falls back if unavailable
+        self.assertIn(
+            decision.provider, [LLMProvider.OLLAMA, LLMProvider.KIMI_K2, LLMProvider.CLAUDE]
+        )

Contributor

🛠️ Refactor suggestion | 🟠 Major

Mock Ollama availability to prevent non-deterministic test behavior.

These routing tests have been updated to accept Ollama as a possible provider, but they don't explicitly mock Ollama's availability. This creates non-deterministic behavior:

  1. If Ollama is running on the test machine, tests route to OLLAMA
  2. If Ollama is not running, tests fall back to CLAUDE or KIMI_K2

The overly permissive assertions (assertIn(decision.provider, [LLMProvider.OLLAMA, ...])) mask this non-determinism but reduce test value by not verifying the specific routing logic.

🔎 Recommended approach: Split into separate test scenarios

Create explicit test cases for each scenario:

@patch("cortex.llm_router.OllamaProvider")
def test_user_chat_routes_to_ollama_when_available(self, mock_ollama_class):
    """User chat routes to Ollama when Ollama is available."""
    mock_ollama = Mock()
    mock_ollama.is_running.return_value = True
    mock_ollama_class.return_value = mock_ollama
    
    self.router.ollama_client = mock_ollama
    decision = self.router.route_task(TaskType.USER_CHAT)
    
    self.assertEqual(decision.provider, LLMProvider.OLLAMA)
    self.assertEqual(decision.task_type, TaskType.USER_CHAT)

@patch("cortex.llm_router.OllamaProvider")
def test_user_chat_falls_back_to_claude_when_ollama_unavailable(self, mock_ollama_class):
    """User chat falls back to Claude when Ollama is unavailable."""
    mock_ollama = Mock()
    mock_ollama.is_running.return_value = False
    mock_ollama_class.return_value = mock_ollama
    
    self.router.ollama_client = mock_ollama
    decision = self.router.route_task(TaskType.USER_CHAT)
    
    self.assertEqual(decision.provider, LLMProvider.CLAUDE)

Apply this pattern to all eight routing tests (lines 38-100).

🤖 Prompt for AI Agents
In tests/test_llm_router.py around lines 38-100 the routing tests accept Ollama
or fallbacks but do not mock Ollama availability, causing non-deterministic
behavior; split each existing test into two deterministic scenarios (Ollama
available and Ollama unavailable) by patching cortex.llm_router.OllamaProvider,
creating a mock_ollama with is_running returning True for the "available" test
and False for the "unavailable" test, assign self.router.ollama_client =
mock_ollama (or set the provider lookup to use the mock) and assert the exact
expected provider and task_type in each case (e.g., expect LLMProvider.OLLAMA
when True, expect the specific fallback like LLMProvider.CLAUDE or KIMI_K2 when
False).

Comment on lines +112 to 123
     def test_fallback_when_ollama_unavailable(self):
         """Should fallback to cloud providers if Ollama unavailable."""
         router = LLMRouter(
-            claude_api_key=None,
-            kimi_api_key="test-kimi-key",
-            enable_fallback=True,  # No Claude
+            claude_api_key="test-claude-key", kimi_api_key="test-kimi-key", enable_fallback=True
         )

-        # User chat normally goes to Claude, should fallback to Kimi
+        # If Ollama unavailable, should fallback to cloud providers
         decision = router.route_task(TaskType.USER_CHAT)
-        self.assertEqual(decision.provider, LLMProvider.KIMI_K2)
+        self.assertIn(
+            decision.provider, [LLMProvider.OLLAMA, LLMProvider.CLAUDE, LLMProvider.KIMI_K2]
+        )

Contributor

🛠️ Refactor suggestion | 🟠 Major

Mock Ollama as unavailable to match test intent.

The test name test_fallback_when_ollama_unavailable clearly indicates that Ollama should be unavailable, but the test doesn't enforce this condition. Without mocking OllamaProvider.is_running() to return False, the test behavior depends on whether Ollama is actually running.

Additionally, the assertion includes LLMProvider.OLLAMA in the accepted list, which contradicts the test's intent to verify fallback behavior when Ollama is unavailable.

🔎 Proposed fix
 @patch.dict(os.environ, {}, clear=True)
-def test_fallback_when_ollama_unavailable(self):
+@patch("cortex.llm_router.OllamaProvider")
+def test_fallback_when_ollama_unavailable(self, mock_ollama_class):
     """Should fallback to cloud providers if Ollama unavailable."""
+    # Mock Ollama as unavailable
+    mock_ollama = Mock()
+    mock_ollama.is_running.return_value = False
+    mock_ollama_class.return_value = mock_ollama
+    
     router = LLMRouter(
         claude_api_key="test-claude-key", kimi_api_key="test-kimi-key", enable_fallback=True
     )
+    router.ollama_client = mock_ollama

-    # If Ollama unavailable, should fallback to cloud providers
+    # Should fallback to cloud providers when Ollama unavailable
     decision = router.route_task(TaskType.USER_CHAT)
-    self.assertIn(
-        decision.provider, [LLMProvider.OLLAMA, LLMProvider.CLAUDE, LLMProvider.KIMI_K2]
-    )
+    # Should not route to Ollama since it's unavailable
+    self.assertIn(decision.provider, [LLMProvider.CLAUDE, LLMProvider.KIMI_K2])
+    self.assertNotEqual(decision.provider, LLMProvider.OLLAMA)

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In tests/test_llm_router.py around lines 112 to 123, the test named
test_fallback_when_ollama_unavailable does not mock OllamaProvider.is_running()
to simulate Ollama being down and still allows LLMProvider.OLLAMA in the
expected results; update the test to patch/mock OllamaProvider.is_running() (or
the router’s check) to return False so Ollama is treated as unavailable, then
change the assertion to only accept cloud providers (remove LLMProvider.OLLAMA)
as valid decisions; keep enable_fallback=True and ensure the mock is applied for
the duration of the test.

@Suyashd999
Collaborator

@sujay-d07 the Ollama prompt doesn't pop up, please check

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
cortex/llm_router.py (2)

569-592: Missing Ollama stats in get_stats() return value.

Ollama usage is tracked in provider_stats[LLMProvider.OLLAMA] (line 210), but get_stats() only returns Claude and Kimi K2 stats. Add Ollama stats to the returned dictionary.

🔎 Proposed fix
         with self._stats_lock:
             return {
                 "total_requests": self.request_count,
                 "total_cost_usd": round(self.total_cost_usd, 4),
                 "providers": {
+                    "ollama": {
+                        "requests": self.provider_stats[LLMProvider.OLLAMA]["requests"],
+                        "tokens": self.provider_stats[LLMProvider.OLLAMA]["tokens"],
+                        "cost_usd": round(self.provider_stats[LLMProvider.OLLAMA]["cost"], 4),
+                    },
                     "claude": {
                         "requests": self.provider_stats[LLMProvider.CLAUDE]["requests"],

639-643: Async completion doesn't support Ollama provider.

The acomplete() method only handles CLAUDE and KIMI_K2 (lines 640-643). If routing selects OLLAMA for an async call, it will incorrectly fall through to the Kimi handler.

Either add async Ollama support or explicitly check and reject Ollama in async context:

if routing.provider == LLMProvider.OLLAMA:
    raise RuntimeError("Async completion not supported for Ollama. Use complete() instead.")
♻️ Duplicate comments (4)
tests/test_llm_router.py (2)

38-100: Non-deterministic test behavior due to missing Ollama availability mocks.

These routing tests accept multiple providers (OLLAMA, CLAUDE, KIMI_K2) as valid outcomes, but don't explicitly control Ollama availability. Test behavior depends on whether Ollama is running on the test machine, reducing test value.

Per the past review comment, consider splitting into deterministic scenarios by mocking OllamaProvider.is_running() to return True or False explicitly.


111-122: Test contradicts its intent by accepting OLLAMA in the assertion.

The test test_fallback_when_ollama_unavailable should verify fallback behavior when Ollama is unavailable, but the assertion at line 121 still includes LLMProvider.OLLAMA as valid. Mock OllamaProvider.is_running() to return False and remove OLLAMA from the accepted providers.

tests/test_ollama_integration.py (1)

41-59: Test structure issues: import placement and instance reuse.

Per the past review comment:

  1. Move ConnectionError import to module level
  2. Comment says "RequestException" but code uses ConnectionError - update comment
  3. Reuse provider instead of creating provider2
setup.py (1)

124-124: Update minimum dependency versions.

Per the past review comment, the minimum versions are significantly outdated:

  • anthropic>=0.18.0 → consider >=0.75.0
  • openai>=1.0.0 → consider >=2.14.0
  • requests>=2.32.4 → consider >=2.32.5

Using older minimums may cause compatibility issues with newer API features.

🧹 Nitpick comments (8)
scripts/setup_ollama.py (1)

23-26: Add type hint for return type consistency.

While the function has a return type hint, consider adding explicit -> bool for consistency with other functions in the file.

cortex/_post_install.py (1)

30-44: Consider more specific exception handling and import validation.

The broad except Exception catches all errors, which may hide unexpected issues. Additionally, the dynamic import assumes scripts.setup_ollama is available, which may not be true if the package wasn't installed correctly.

🔎 Proposed improvements
     try:
         # Import and run the setup function
+        try:
+            from scripts.setup_ollama import setup_ollama
+        except ImportError as ie:
+            print(f"⚠️  Cannot import setup_ollama: {ie}")
+            print("ℹ️  The scripts package may not be installed correctly")
+            print("ℹ️  Try reinstalling: pip install -e .")
+            return
-        from scripts.setup_ollama import setup_ollama

         setup_ollama()

         # Create marker file to prevent running again
         os.makedirs(os.path.dirname(marker_file), exist_ok=True)
         with open(marker_file, "w") as f:
             f.write("Setup completed\n")

-    except Exception as e:
+    except KeyboardInterrupt:
+        print("\n⚠️  Setup cancelled by user")
+        print("ℹ️  You can run it manually later with: cortex-setup-ollama")
+    except OSError as e:
+        print(f"⚠️  Filesystem error during setup: {e}")
+        print("ℹ️  You can run it manually later with: cortex-setup-ollama")
+    except Exception as e:
         print(f"⚠️  Ollama setup encountered an issue: {e}")
         print("ℹ️  You can run it manually later with: cortex-setup-ollama")

More specific exception handling helps with debugging and provides clearer error messages to users.

scripts/verify_ollama_setup.sh (1)

12-16: Remove unused YELLOW variable.

The YELLOW variable is defined but never used in the script. Either remove it or use it for warning messages to maintain clean code.

 # Colors
 GREEN='\033[0;32m'
 RED='\033[0;31m'
-YELLOW='\033[1;33m'
 NC='\033[0m' # No Color
tests/test_ollama_integration.py (1)

18-18: Remove unused MagicMock import.

MagicMock is imported but not used in the test file. Only Mock is used.

-from unittest.mock import MagicMock, Mock, patch
+from unittest.mock import Mock, patch
setup.py (1)

10-54: Extract common post-install logic to reduce duplication.

PostInstallCommand and PostDevelopCommand have nearly identical run() implementations. Consider extracting the common logic into a helper function.

🔎 Proposed refactor
def _run_ollama_setup():
    """Common post-install Ollama setup logic."""
    print("\n" + "=" * 70)
    print("🚀 Running Cortex post-installation setup...")
    print("=" * 70 + "\n")
    try:
        from scripts.setup_ollama import setup_ollama
        setup_ollama()
    except Exception as e:
        print(f"⚠️  Ollama setup encountered an issue: {e}")
        print("ℹ️  You can run it manually later with: cortex-setup-ollama")
    finally:
        print("\n" + "=" * 70)
        print("💡 TIP: If Ollama setup didn't run, execute: cortex-setup-ollama")
        print("=" * 70)


class PostInstallCommand(install):
    def run(self):
        install.run(self)
        _run_ollama_setup()


class PostDevelopCommand(develop):
    def run(self):
        develop.run(self)
        _run_ollama_setup()
tests/test_ollama_setup_integration.py (2)

7-11: Remove unused imports.

shutil and tempfile are imported but never used in the file.

 import os
-import shutil
 import sys
-import tempfile
 from pathlib import Path

18-92: Add type hints to test functions.

Per the coding guidelines, type hints are required for Python code. Add return type annotations to the test functions.

def test_setup_import() -> bool:
def test_setup_execution() -> bool:
def test_package_structure() -> bool:
def test_manifest_includes() -> bool:
def main() -> int:
cortex/llm_router.py (1)

436-448: Use Ollama's actual token counts instead of word-count estimation.

The current estimation (len(content.split()) * 1.3) is imprecise. Ollama's /api/generate response includes eval_count (output tokens) and prompt_eval_count (input tokens) fields that provide accurate counts.

🔎 Proposed improvement
         content = response_data.get("response", "")
         model = response_data.get("model", "unknown")

-        # Ollama doesn't provide token counts in the same way
-        # Estimate based on response length
-        tokens_used = len(content.split()) * 1.3  # Rough estimate
+        # Use actual token counts from Ollama response
+        eval_count = response_data.get("eval_count", 0)
+        prompt_eval_count = response_data.get("prompt_eval_count", 0)
+        tokens_used = eval_count + prompt_eval_count
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 362b43d and c4d35f1.

📒 Files selected for processing (17)
  • AGENTS.md
  • MANIFEST.in
  • README.md
  • cortex/_post_install.py
  • cortex/llm/interpreter.py
  • cortex/llm_router.py
  • docs/AUTOMATIC_OLLAMA_SETUP.md
  • docs/OLLAMA_AUTO_SETUP_IMPLEMENTATION.md
  • docs/OLLAMA_AUTO_SETUP_QUICKREF.md
  • pyproject.toml
  • scripts/__init__.py
  • scripts/setup_ollama.py
  • scripts/verify_ollama_setup.sh
  • setup.py
  • tests/test_llm_router.py
  • tests/test_ollama_integration.py
  • tests/test_ollama_setup_integration.py
✅ Files skipped from review due to trivial changes (2)
  • scripts/init.py
  • MANIFEST.in
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • tests/test_ollama_setup_integration.py
  • tests/test_llm_router.py
  • cortex/llm/interpreter.py
  • cortex/llm_router.py
  • scripts/setup_ollama.py
  • tests/test_ollama_integration.py
  • setup.py
  • cortex/_post_install.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Maintain >80% test coverage for pull requests

Files:

  • tests/test_ollama_setup_integration.py
  • tests/test_llm_router.py
  • tests/test_ollama_integration.py
{setup.py,setup.cfg,pyproject.toml,**/__init__.py}

📄 CodeRabbit inference engine (AGENTS.md)

Use Python 3.10 or higher as the minimum supported version

Files:

  • setup.py
  • pyproject.toml
**/*install*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*install*.py: Dry-run by default for all installations in command execution
No silent sudo execution - require explicit user confirmation
Implement audit logging to ~/.cortex/history.db for all package operations

Files:

  • cortex/_post_install.py
🧬 Code graph analysis (8)
tests/test_ollama_setup_integration.py (1)
scripts/setup_ollama.py (1)
  • setup_ollama (358-424)
tests/test_llm_router.py (1)
cortex/providers/ollama_provider.py (3)
  • is_running (141-152)
  • is_installed (79-86)
  • select_best_model (215-246)
cortex/llm/interpreter.py (1)
cortex/providers/ollama_provider.py (4)
  • is_running (141-152)
  • start_service (154-191)
  • select_best_model (215-246)
  • complete (310-369)
cortex/llm_router.py (1)
cortex/providers/ollama_provider.py (6)
  • OllamaProvider (31-436)
  • is_installed (79-86)
  • is_running (141-152)
  • start_service (154-191)
  • select_best_model (215-246)
  • complete (310-369)
scripts/setup_ollama.py (2)
cortex/providers/ollama_provider.py (1)
  • install_ollama (89-139)
cortex/logging_system.py (1)
  • info (200-202)
tests/test_ollama_integration.py (3)
cortex/llm_router.py (5)
  • LLMProvider (48-53)
  • LLMRouter (79-860)
  • TaskType (35-45)
  • route_task (231-291)
  • complete (293-402)
cortex/providers/ollama_provider.py (7)
  • OllamaProvider (31-436)
  • is_installed (79-86)
  • get_available_models (193-213)
  • select_best_model (215-246)
  • pull_model (248-282)
  • complete (310-369)
  • install_ollama (89-139)
scripts/setup_ollama.py (2)
  • setup_ollama (358-424)
  • install_ollama (28-108)
setup.py (1)
scripts/setup_ollama.py (1)
  • setup_ollama (358-424)
cortex/_post_install.py (1)
scripts/setup_ollama.py (1)
  • setup_ollama (358-424)
🪛 LanguageTool
docs/OLLAMA_AUTO_SETUP_QUICKREF.md

[style] ~128-~128: Some style guides suggest that commas should set off the year in a month-day-year date.
Context: ...om --- Last Updated: December 25, 2025 Status: ✅ Complete and Tested

(MISSING_COMMA_AFTER_YEAR)

🪛 markdownlint-cli2 (0.18.1)
docs/OLLAMA_AUTO_SETUP_IMPLEMENTATION.md

10-10: Strong style
Expected: asterisk; Actual: underscore

(MD050, strong-style)


10-10: Strong style
Expected: asterisk; Actual: underscore

(MD050, strong-style)


48-48: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


57-57: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


118-118: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


167-167: Strong style
Expected: asterisk; Actual: underscore

(MD050, strong-style)


167-167: Strong style
Expected: asterisk; Actual: underscore

(MD050, strong-style)


243-243: Strong style
Expected: asterisk; Actual: underscore

(MD050, strong-style)


243-243: Strong style
Expected: asterisk; Actual: underscore

(MD050, strong-style)


287-287: Bare URL used

(MD034, no-bare-urls)


288-288: Bare URL used

(MD034, no-bare-urls)


289-289: Bare URL used

(MD034, no-bare-urls)

docs/OLLAMA_AUTO_SETUP_QUICKREF.md

69-69: Strong style
Expected: asterisk; Actual: underscore

(MD050, strong-style)


69-69: Strong style
Expected: asterisk; Actual: underscore

(MD050, strong-style)


122-122: Bare URL used

(MD034, no-bare-urls)


123-123: Bare URL used

(MD034, no-bare-urls)


124-124: Bare URL used

(MD034, no-bare-urls)

docs/AUTOMATIC_OLLAMA_SETUP.md

45-45: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


96-96: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


119-119: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


211-211: Bare URL used

(MD034, no-bare-urls)


212-212: Bare URL used

(MD034, no-bare-urls)


213-213: Bare URL used

(MD034, no-bare-urls)

🪛 Shellcheck (0.11.0)
scripts/verify_ollama_setup.sh

[warning] 15-15: YELLOW appears unused. Verify use (or export if used externally).

(SC2034)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build Package
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.11)
🔇 Additional comments (20)
pyproject.toml (1)

10-10: LGTM! Valid license format update.

The change from string to table format (license = {text = "Apache-2.0"}) aligns with PEP 621 standards and resolves setuptools warnings, as noted in the implementation documentation.

README.md (2)

66-74: Excellent local-first messaging with clear benefits.

The updated features table effectively communicates the privacy-first, offline-capable, and zero-cost value proposition. The emphasis on Ollama as the default with cloud fallback is clear and compelling.


100-113: Verify automatic setup triggers on first use.

The documentation states that "setup will run automatically when you first use cortex" and that users will be "prompted to select which Ollama model to download." However, a PR comment from @Suyashd999 reports that "the ollama prompt doesn't pop up."

This suggests the automatic setup may not be triggering as documented. To properly verify this, the following need to be checked:

  • Whether setup.py uses post-install hooks (which typically fail with wheel installs) or runtime initialization
  • Whether the _post_install.py wiring is correct and properly triggered on first use
  • The actual entry point configuration and when initialization runs
docs/OLLAMA_AUTO_SETUP_IMPLEMENTATION.md (1)

1-304: Documentation is comprehensive and well-structured.

The implementation summary effectively documents the auto-setup feature, including architecture, flow diagrams, testing procedures, and troubleshooting guidance. This will be valuable for maintainers and contributors.

scripts/setup_ollama.py (2)

142-189: Good interactive UX with sensible defaults.

The model selection prompt is well-designed with clear options, size information, and a default choice. The error handling gracefully falls back to the default model on invalid input.


358-428: Verify interactive prompt behavior and marker file logic during installation.

The review concern about sys.stdin.isatty() (line 375) failing during pip install -e . is a real and documented issue. When post-install hooks run in non-TTY environments, isatty() correctly returns false, causing the interactive prompts to be skipped. This is expected behavior, not a bug.

However, to ensure the review comment addresses the actual implementation accurately, verify:

  1. Whether cortex/_post_install.py contains a marker file check that prevents re-runs after initial install
  2. Whether AGENTS.md (lines 20-31) mentions the python setup.py develop vs pip install -e . distinction for TTY preservation
  3. Whether post-install behavior is documented for users encountering skipped prompts

If users cannot run interactive setup after installation, consider:

  • Adding a cortex-setup-ollama standalone command that can be run manually from an interactive terminal (already suggested in code at line 393)
  • Documenting that the interactive prompt only works when running setup in a TTY environment
  • Supporting environment variables like OLLAMA_DEFAULT_MODEL=codellama:7b for non-interactive deployments (already suggested in review)
docs/AUTOMATIC_OLLAMA_SETUP.md (1)

1-213: Comprehensive documentation with clear troubleshooting guidance.

This document provides excellent coverage of the automatic setup feature, including architecture, flow diagrams, testing procedures, and troubleshooting steps. The environment variable reference and command documentation are particularly helpful.

docs/OLLAMA_AUTO_SETUP_QUICKREF.md (1)

1-129: Excellent quick reference with clear structure.

This quick reference effectively summarizes the auto-setup feature with practical examples, verification steps, and troubleshooting guidance. The table format and clear sections make it easy to scan and find information quickly.

tests/test_llm_router.py (3)

137-145: Test correctly updated to match new initialization behavior.

The test now expects RuntimeError during LLMRouter initialization when no providers are available, which aligns with the updated router logic. The error message assertion validates user-friendly guidance is provided.


423-514: Well-structured end-to-end tests with proper Ollama mocking.

These tests properly mock OllamaProvider to control availability and verify:

  1. Ollama-first routing when available (lines 423-466)
  2. Fallback cascade when Ollama fails (lines 468-514)

The mock setup is deterministic and validates the expected routing behavior.


773-786: Proper Ollama mock for rate limit test.

The test correctly mocks OllamaProvider to simulate Ollama availability, ensuring the rate limit semaphore test is deterministic.

scripts/verify_ollama_setup.sh (1)

18-105: Comprehensive verification script for Ollama setup integration.

The script provides thorough validation of the Ollama auto-setup workflow:

  • Project structure verification
  • Import testing
  • Skipped-mode execution testing
  • Integration test execution
  • User guidance for next steps
cortex/llm/interpreter.py (1)

144-187: Good refactor to use OllamaProvider for local LLM calls.

The _call_ollama method now properly delegates to OllamaProvider:

  • Ensures service is running with clear error messages
  • Selects best model with user guidance when unavailable
  • Uses the provider's complete() method for consistent behavior

The error messages (lines 155-170) provide actionable guidance for users.

tests/test_ollama_integration.py (2)

214-276: Good test coverage for Ollama installation scenarios.

The TestOllamaSetup class covers:

  • Successful installation with subprocess mocking
  • Skip behavior when already installed
  • Failure handling with non-zero return code

The mock configuration properly simulates the installation process.


186-211: Verify mock chain for Ollama completion test.

The test sets mock_ollama.complete.return_value with a dict, but the assertion at line 211 checks response.content.lower(). Ensure the router's _complete_ollama method properly extracts the content from the mock's return value.

If the test fails with a MagicMock in response.content, the mock setup may need adjustment to patch LLMRouter._complete_ollama directly instead of the provider.

#!/bin/bash
# Verify that test_complete_with_ollama passes
cd tests
python -m pytest test_ollama_integration.py::TestLLMRouter::test_complete_with_ollama -v 2>&1 | head -50
setup.py (1)

56-106: Well-designed idempotent post-install hook.

PostEggInfoCommand implements proper safeguards:

  • Marker file prevents duplicate setup runs
  • CI/non-TTY detection avoids blocking automated builds
  • Graceful handling of KeyboardInterrupt
  • Clear user guidance on manual setup
tests/test_ollama_setup_integration.py (1)

95-148: Test orchestration is well-structured.

The main() function provides:

  • Clear test execution with exception handling
  • Summary output with pass/fail counts
  • Actionable next steps on success
cortex/llm_router.py (3)

94-106: COSTS dictionary properly includes Ollama with zero cost.

The cost tracking correctly reflects that local Ollama inference is free.


148-174: Good defensive Ollama initialization with clear guidance.

The initialization:

  • Avoids delays with auto_pull=False
  • Provides helpful messages when service isn't running or models are missing
  • Gracefully handles initialization failures

301-344: Good infinite loop prevention in fallback logic.

The _attempted_providers set tracks which providers have been tried, preventing infinite recursion. The error message at lines 338-342 clearly lists available providers for debugging.

Comment on lines +11 to +12
def run_setup():
"""Run Ollama setup after installation."""
Contributor

🛠️ Refactor suggestion | 🟠 Major

Add missing return type hint.

The function run_setup() is missing a return type hint. Per coding guidelines, type hints are required for all functions.

🔎 Proposed fix
-def run_setup():
+def run_setup() -> None:
     """Run Ollama setup after installation."""

As per coding guidelines, type hints are required in Python code.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-def run_setup():
+def run_setup() -> None:
     """Run Ollama setup after installation."""
🤖 Prompt for AI Agents
In cortex/_post_install.py around lines 11-12, the function signature for
run_setup() is missing a return type hint; update the signature to include an
explicit return type (e.g., -> None) so it follows the project's typing
guidelines, and ensure any related imports or type annotations are added if
required.

Comment on lines +229 to +251
        # Find the matching closing bracket
        if content.startswith("{"):
            # Find matching }
            brace_count = 0
            for i, char in enumerate(content):
                if char == "{":
                    brace_count += 1
                elif char == "}":
                    brace_count -= 1
                    if brace_count == 0:
                        content = content[: i + 1]
                        break
        elif content.startswith("["):
            # Find matching ]
            bracket_count = 0
            for i, char in enumerate(content):
                if char == "[":
                    bracket_count += 1
                elif char == "]":
                    bracket_count -= 1
                    if bracket_count == 0:
                        content = content[: i + 1]
                        break
Contributor

⚠️ Potential issue | 🟡 Minor

Brace/bracket matching doesn't handle JSON strings containing literal braces.

The current logic counts all { and } characters, but this can incorrectly truncate JSON when string values contain literal braces (e.g., {"error": "Missing { in config"}).

For the typical command output format ({"commands": ["..."]}), this is unlikely to cause issues, but be aware of this edge case if parsing failures are reported.

🤖 Prompt for AI Agents
In cortex/llm/interpreter.py around lines 229-251, the brace/bracket matching
logic incorrectly counts braces/brackets that appear inside JSON string
literals; update the loop to track string state and escapes so braces/brackets
are only counted when not inside a string: introduce boolean flags (in_string,
escape) and toggle in_string on unescaped double quotes, toggle escape on
backslashes, and only increment/decrement brace_count or bracket_count when
in_string is False; apply the same string-aware logic to both the '{'...'}' and
'['...']' branches so literal braces inside JSON strings no longer break the
truncation.
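
A sketch of that string-aware variant; the function name is illustrative, only the object case is shown, and the list case is symmetric:

def truncate_to_json_object(content: str) -> str:
    """Cut `content` down to its first complete top-level JSON object."""
    depth = 0
    in_string = False
    escape = False
    for i, char in enumerate(content):
        if in_string:
            if escape:
                escape = False          # previous backslash consumed this char
            elif char == "\\":
                escape = True
            elif char == '"':
                in_string = False
        elif char == '"':
            in_string = True
        elif char == "{":
            depth += 1
        elif char == "}":
            depth -= 1
            if depth == 0:
                return content[: i + 1]
    return content                       # unbalanced: return unchanged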

Comment on lines 260 to 265
        except (json.JSONDecodeError, ValueError) as e:
            # Log the problematic content for debugging
            import logging

            logging.error(f"Failed to parse LLM response. Content: {content[:500]}")
            raise ValueError(f"Failed to parse LLM response: {str(e)}")
Contributor

🛠️ Refactor suggestion | 🟠 Major

Move logging import to module level.

The logging module is imported inside the exception handler. Per PEP 8, imports should be at the top of the file. This also avoids repeated import overhead if the exception is raised multiple times.

+import logging
 import json
 import os
 import sqlite3

And remove line 262:

         except (json.JSONDecodeError, ValueError) as e:
             # Log the problematic content for debugging
-            import logging
-
             logging.error(f"Failed to parse LLM response. Content: {content[:500]}")

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In cortex/llm/interpreter.py around lines 260 to 265, the logging import is
inside the exception handler; remove the inline "import logging" on line 262 and
add "import logging" at the top of the module (with the other imports). Keep the
existing logging.error call as-is so the exception handler logs the truncated
content, and ensure there are no leftover local imports in that block.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
README.md (1)

377-381: Resolve inconsistency between "In Progress" status and feature availability.

Line 380 lists "Ollama local model support" as unchecked (In Progress), but the rest of the README describes Ollama as the primary, available feature (e.g., lines 67, 103–105, 137, 181–212). This creates confusion about whether Ollama is production-ready or still experimental.

Please update the Project Status section to reflect the actual state of Ollama integration.

🧹 Nitpick comments (2)
README.md (2)

304-360: Add Ollama-specific troubleshooting guidance.

The Troubleshooting section covers cloud provider configuration and general Python setup but lacks any Ollama-specific diagnostics. Given that Ollama is now the default provider, users will likely encounter Ollama-related issues (e.g., service not running, model pull failures, connection errors).

Consider adding a subsection to help users verify Ollama installation, check service status, and debug model loading.

🔎 Suggested Ollama troubleshooting section
<details>
<summary><strong>"Ollama service not running"</strong></summary>

```bash
# Check if Ollama is installed
ollama --version

# Start Ollama service (if not auto-starting)
ollama serve

# Verify it's running
curl http://localhost:11434

# Check available models
ollama list
```
</details>

<details>
<summary><strong>"cortex-setup-ollama hangs or fails"</strong></summary>

```bash
# Ensure Ollama service is running first
ollama serve &

# Then re-run setup
cortex-setup-ollama

# If still failing, check logs
journalctl -u ollama -n 50
```
</details>

181-212: Clarify default model and first-run verification steps.

The Local LLM Support section provides good model recommendations and fallback explanation, but it doesn't clarify:

  • Which model is the default if the user doesn't explicitly choose during setup?
  • What happens if no models are installed when cortex install is first run?
  • How to verify Ollama connection is working before using Cortex.

Consider adding a quick checklist or note to guide first-time users through: install Ollama → start service → pull a model → verify connection → use Cortex.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c4d35f1 and fe5a528.

📒 Files selected for processing (1)
  • README.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Build Package
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.10)
  • GitHub Check: test (3.12)

Comment on lines +103 to +105
# During installation, you'll be prompted to install Ollama (local AI)
# Choose 'y' for offline/private AI, or 'n' to use cloud providers
```
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, find and examine the post-install hook file
find . -name "*post_install*" -type f | head -20

Repository: cortexlinux/cortex

Length of output: 87


🏁 Script executed:

# Get the exact content from lines 103-105 of README.md
sed -n '103,105p' README.md

Repository: cortexlinux/cortex

Length of output: 205


🏁 Script executed:

# Search for the setup.py or installation script to understand the installation flow
find . -name "setup.py" -o -name "setup.cfg" -o -name "pyproject.toml" | head -10

Repository: cortexlinux/cortex

Length of output: 89


🏁 Script executed:

# Search for any Ollama-related prompts or interactive setup code in the codebase
rg -i "ollama.*prompt|prompt.*ollama|you'll be prompted" --type py --type md

Repository: cortexlinux/cortex

Length of output: 740


🏁 Script executed:

# Examine the post-install hook implementation
cat -n cortex/_post_install.py

Repository: cortexlinux/cortex

Length of output: 1805


🏁 Script executed:

# Check setup.py to see how post-install is configured
cat setup.py

Repository: cortexlinux/cortex

Length of output: 5969


🏁 Script executed:

# Examine the first_run_wizard prompt logic
grep -A 10 -B 5 "Install Ollama now" cortex/first_run_wizard.py

Repository: cortexlinux/cortex

Length of output: 765


🏁 Script executed:

# Search for any issue references or PR discussions about the prompt
rg -i "suyashd|prompt.*doesnt|doesnt.*pop" --type md --type py --type txt

Repository: cortexlinux/cortex

Length of output: 44


🏁 Script executed:

# Examine the setup_ollama function to understand the full flow
cat -n scripts/setup_ollama.py | head -100

Repository: cortexlinux/cortex

Length of output: 3761


🏁 Script executed:

# Check if there are any GitHub issues or PR comments mentioning the prompt issue
git log --all --oneline --grep="ollama\|prompt" | head -20

Repository: cortexlinux/cortex

Length of output: 373


🏁 Script executed:

# Search for recent PRs or issues mentioning Ollama prompt issues
git log --all --oneline | grep -i "PR\|#\|ollama" | head -30

Repository: cortexlinux/cortex

Length of output: 2164


🏁 Script executed:

# Check if there are any discussion files or issue templates
find . -name "*.md" -type f | xargs grep -l "Suyashd999\|ollama prompt" 2>/dev/null

Repository: cortexlinux/cortex

Length of output: 46


🏁 Script executed:

# Look at the full setup_ollama.py to understand the complete flow
wc -l scripts/setup_ollama.py

Repository: cortexlinux/cortex

Length of output: 89


🏁 Script executed:

# Check what happens when setup_ollama is called - is there a wizard or direct setup?
grep -n "def setup_ollama\|def _prompt" scripts/setup_ollama.py | head -20

Repository: cortexlinux/cortex

Length of output: 85


🏁 Script executed:

# Read the complete setup_ollama function
sed -n '358,428p' scripts/setup_ollama.py

Repository: cortexlinux/cortex

Length of output: 2790


🏁 Script executed:

# Check if the wizard is being imported and called
grep -n "import.*wizard\|from.*wizard" scripts/setup_ollama.py

Repository: cortexlinux/cortex

Length of output: 44


🏁 Script executed:

# Look for any conditional logic that might skip the prompt
grep -n "isatty\|interactive\|TTY\|stdin" scripts/setup_ollama.py

Repository: cortexlinux/cortex

Length of output: 242


🏁 Script executed:

# Double-check README context around those lines to understand full installation instructions
sed -n '95,115p' README.md

Repository: cortexlinux/cortex

Length of output: 578


Document the interactive TTY requirement for Ollama setup prompt.

Lines 103-105 promise "you'll be prompted to install Ollama" during python setup.py develop, but the prompt only appears in interactive (TTY) mode. When run in non-interactive shells (piped input, CI/CD, some terminal emulators), the setup silently skips with no user interaction. This creates a gap between documented expectations and actual behavior.

Update the documentation to clarify the TTY requirement, or provide explicit instructions for non-interactive setups (e.g., use cortex-setup-ollama command or set environment variables to skip/customize Ollama installation).

🤖 Prompt for AI Agents
In README.md around lines 103 to 105, the doc claims "you'll be prompted to
install Ollama" during python setup but omits that the prompt only appears in an
interactive TTY; update the text to state the prompt requires an interactive
terminal and will be skipped in non-interactive environments (CI/piped shells),
and add a short note on alternatives: run the included cortex-setup-ollama
command for unattended installs or set environment variables/flags to skip or
preconfigure Ollama installation; keep the wording concise and add examples of
the two options (interactive vs non-interactive) so users know how to proceed in
CI.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@sonarqubecloud

Quality Gate failed

Failed conditions
1 Security Hotspot

See analysis details on SonarQube Cloud

@sujay-d07 sujay-d07 marked this pull request as draft December 26, 2025 10:33
@Suyashd999
Collaborator

@sujay-d07 some modules from requirements.txt aren't installed. Please run pip install -r requirements.txt as part of setup.py


Development

Successfully merging this pull request may close these issues.

[FEATURE] Ollama Integration - Local LLM Support
